1 Introducing the lexeme 


It is customary (see for instance Aronoff 1994: 4) to associate the notion of a lexeme with 
Peter H. Matthews (1965, 1972, 1974, 1991).¹ Matthews (1972: 160-161) contrasts three uses 
of the term word that may be differentiated as follows. 


* The term word may denote a certain type of syntactic constituent. In this sense, 
the term unambiguously designates a kind of Saussurean sign, possibly complex: 
it associates a phonological representation with a meaning. 


* The term word is used to denote the phonological sequence that is the shape of a 
word in the first sense. Matthews coins the term wordform to designate this and 
avoid ambiguity.² 


* The term word often denotes that lexical object dictionaries talk about: an item 
characterized by a stable lexical meaning and a set of syntactic properties, but 
that abstracts away from inflection. This unit is what Matthews calls the lexeme. 


¹ Matthews (1972: 160) himself notes that the use of the word lexeme in this sense originates in Lyons (1963), 
and that his understanding of the lexeme is very close to that of the semanteme in Bally (1944: 287). See 
also Trnka (1949: 28). On the other hand, the use of lexeme in the tradition starting with Matthews has little 
to do with Martinet's lexème (e.g. Martinet 1960), which designates what in the English-speaking world 
would be called a morpheme with lexical meaning, or a root. 


One may illustrate these definitions by saying that the French lexeme vieux ‘old’ is 
associated with four words filling the four cells in its paradigm: M.SG vieux, F.SG vieille, 
M.PL vieux, and F.PL vieilles. To these four words correspond only three wordforms, since 
the M.SG and M.PL are phonologically identical. This characterization of the lexeme is 
deliberately silent on phonology: the lexeme is defined in terms of the syntactic and 
semantic cohesion of a family of words, ignoring phonology. Literature from the 1990s 
was not so prudent, and presented the lexeme as an underspecified sign. The following 
quote is representative of the dominant view: 


Each lexeme can be viewed as a set of properties, which will in some sense be present 
in all occurrences of the lexeme. These crucially include some semantic properties, 
some phonological properties [...], and some syntactic properties. (Zwicky 1992: 333) 


Such a definition is obviously not adequate if one wants to be able to take into account 
the full spectrum of stem allomorphy, including suppletion. In some cases, there is no 
phonological property that is shared by all forms of the lexeme; e.g. there is nothing 
common between the 3sg forms of the French lexeme ALLER 'go' in the imperfect (allait), 
present (va) and future (ira). This example shows that lexemes are ineffable: one can't 
utter a lexeme, but only one of its forms. It also highlights the importance of cleanly 
distinguishing lexemes from their CITATION FORM.³ The French grammatical tradition 
happens to use infinitives as citation forms, and the infinitive of ALLER happens to use 
the al- stem. From this, no conclusion can be drawn as to al- being more reflective of 
the fundamental phonological identity of that lexeme: if French grammarians had kept 
the Latin tradition of using the present 1SG as a citation form, we would call the lexeme 
VAIS, and the v- stem would seem crucial. 
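
The three-way distinction between lexeme, word and wordform can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class names `Word` and `Lexeme`, the cell labels, and the use of spelling as a stand-in for phonological shape are all hypothetical choices, not part of Matthews' proposal.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word in the first sense: one paradigm cell paired with a shape."""
    cell: str      # morphosyntactic description of the cell, e.g. "M.SG"
    wordform: str  # spelling, used here as a stand-in for phonological shape


@dataclass(frozen=True)
class Lexeme:
    """An abstract unit: a label, a gloss and a paradigm, but no shape of its own."""
    label: str     # citation label; purely a naming convention
    gloss: str
    words: tuple   # the words filling the cells of the paradigm

    def wordforms(self):
        # Distinct shapes: several words may share one wordform.
        return {w.wordform for w in self.words}

    def shared_prefix(self):
        # Longest initial string common to all wordforms (may be empty).
        return os.path.commonprefix(sorted(self.wordforms()))


VIEUX = Lexeme("VIEUX", "old", (
    Word("M.SG", "vieux"), Word("F.SG", "vieille"),
    Word("M.PL", "vieux"), Word("F.PL", "vieilles"),
))

# Only the three 3SG cells cited in the text; the full paradigm is much larger.
ALLER = Lexeme("ALLER", "go", (
    Word("IPFV.3SG", "allait"), Word("PRS.3SG", "va"), Word("FUT.3SG", "ira"),
))

print(len(VIEUX.words), len(VIEUX.wordforms()))  # 4 3
print(repr(ALLER.shared_prefix()))               # ''
```

In this sketch the label `ALLER` does no work: renaming the lexeme `VAIS` would leave its paradigm untouched, which is the point made above about citation forms.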

² Lyons (1968) and some more recent authors use phonological word instead of wordform. This is problematic, 
“phonological word” being standardly used to denote a particular type of prosodic constituent, which may 
or may not be coextensive with a wordform. Matthews is explicit on the difference between wordforms and 
phonological words, both in Matthews (1972: 2, 96, 161) and in the second edition of his textbook (Matthews 
1991: 42, 216). Unfortunately, the first edition was somewhat confusing on this particular issue (Matthews 
1974: 32-33, 35). Adding to the confusion, Mel'čuk (1993) and Fradin (2003) use the French term mot-forme 
(literally, “word-form”) to denote what Matthews, and after him the whole English-speaking literature, 
simply calls word. 

³ The unfortunate use of the term lemma in many discussions in psycholinguistics and Natural Language 
Processing rests on such a confusion between lexeme and citation form. 


Because the definition of a lexeme derives from that of an inflectional paradigm (lexemes 
abstract away from inflection), using the notion commits one to a particular view of 
morphology. It presupposes the existence of a split between inflectional and derivational 
morphology (Matthews 1965: 140, note 4; Anderson 1982; Perlmutter 1988). Delineating 
the set of words instantiating the same lexeme, such as the one shown in (1a), requires 
one to distinguish it from a set of words that merely belong to the same morphological 
family, such as the one in (1b). 


(1) a. { vieux ‘old’ M.SG, vieille ‘old’ F.SG, vieux ‘old’ M.PL, vieilles ‘old’ F.PL } 


b. { vieux ‘old’ M.SG, vieillard ‘old man’ SG, vieillesse ‘old age’ SG } 


As characterised above, the lexeme is a descriptive category. As such it is compati- 
ble with diverse models of morphology, as long as they implement a notion of struc- 
tured paradigms and split morphology. In practice, however, the notion of a lexeme is 
mainly used within theoretical frameworks that adopt a constructive view of morphol- 
ogy (Blevins 2006) and use the lexeme as the pivot of the theory, linking inflection and 
derivation. Following Fradin (2003), we may call this family of frameworks LEXEMIC MOR- 
PHOLOGY, and assume that they rely on the series of key hypotheses in (2). The wording 
is deliberately noncommittal as to how inflection is to be modeled, since proponents of 
lexemic morphology have assumed either Item and Process or Word and Paradigm ap- 
proaches (Hockett 1954). 


(2) a. Atoms of morphological description are SIMPLE LEXEMES. 

b. LEXEME FORMATION RULES predict the possibility of COMPLEX LEXEMES from 
either a single pre-established lexeme (DERIVATION) or a pair of pre-established 
lexemes (COMPOSITION). 

c. Inflectional morphology deduces, for each lexeme, the set of words constitut- 
ing its inflected forms. 


It is noteworthy that such a conception of morphology predates the coining of the 
term lexeme. It is very clearly outlined by Kuryłowicz (1945-1949), where theme plays a 
role analogous to lexeme as used by lexemic morphology: 


When we say that lupulus is derived from lupus, or, more precisely, that the theme 
lup-ul- is derived from the theme lup-, this means that the paradigm of lupulus is 
derived from the paradigm of lupus. 


[...] 


The derivation process for lupulus takes the following concrete form: 


lupus → -i, -o, -um, -orum, -is, -os      or      lup- → (-us, -i, -o, etc.) 
lupulus → -i, -o, -um, -orum, -is, -os            lupul- → (-us, -i, -o, etc.) 


(Kuryłowicz 1945-1949: p. 123; my translation) 
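
Kuryłowicz's point that deriving a theme amounts to deriving a whole paradigm can be sketched as follows. This is a minimal illustration, not his formalism: the function names `inflect` and `diminutive` are hypothetical, only a subset of the cells of the Latin second declension masculine is listed, and the diminutive rule is reduced to plain suffixation of -ul-.

```python
# Endings for a subset of the second declension masculine cells.
SECOND_DECLENSION_M = {
    "NOM.SG": "us", "GEN.SG": "i", "DAT.SG": "o", "ACC.SG": "um",
    "GEN.PL": "orum", "DAT.PL": "is", "ACC.PL": "os",
}


def inflect(theme):
    """Inflection: deduce the words of the paradigm from the theme."""
    return {cell: theme + ending for cell, ending in SECOND_DECLENSION_M.items()}


def diminutive(theme):
    """Derivation: map a theme to a new theme (lup- to lup-ul-)."""
    return theme + "ul"


lupus = inflect("lup")
lupulus = inflect(diminutive("lup"))

print(lupus["NOM.SG"], lupus["GEN.PL"])      # lupus luporum
print(lupulus["NOM.SG"], lupulus["GEN.PL"])  # lupulus lupulorum
```

The derivational rule never mentions an inflected form: it operates on the theme, and the paradigm of lupulus follows from the same inflectional mapping that yields the paradigm of lupus.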
2 Morpheme, lexeme, and the recent history of 
morphology 


The notion of a morpheme is without doubt the most popular theoretical innovation of 
20th century morphology.⁴ Although questions about its usefulness were raised from the 
1950s, most notably by Hockett (1954, 1967), Robins (1959), Chomsky (1965) and Matthews 
(1965), morphemic analysis firmly occupied the center of the stage until the 1990s. 
Accordingly, the notion of a lexeme barely figured in discussions of morphology. For 
example, although he adopts a word-based (vs. morpheme-based) approach to morphology, 
Aronoff (1976) claims in his preface that he has “avoided the term lexeme [instead of 
word] for personal reasons” and used “the term morpheme in the American structuralist 
sense, which means that a morpheme must have phonological substance and cannot be 
simply a unit of meaning”. 

In the 1980s, most generative morphologists (Lieber 1981, Williams 1981, Selkirk 1982) 
explicitly reject word-based models and assume that the traditional morpheme is a legit- 
imate unit of analysis (Lieber 2015b). Aronoff (2007) claims that the classical lexicalist 
hypothesis (Chomsky 1970) holds instead that the central basic meaningful constituents 
of language are not morphemes but lexemes. However, even among supporters of the 
lexicalist hypothesis, things are not so clear. Some of them, such as Halle (1973), explic- 
itly adopt a so-called Item-and-Arrangement (IA) model while others, such as Jackendoff 
(1975), adopt a so-called Item-and-Process (IP) model. Hockett (1954) coined these two 
terms IA and IP to refer to two different views of mapping between phonological form 
and morphosyntactic and semantic information. In IA models, complex words are viewed 
as arrangements of lexical and derivational morphemes; in IP models, they are viewed 
as the result of an operation, called a Word Formation Rule (Aronoff 1976), applying to 
a root paired with a set of morphosyntactic features and possibly modifying its phono- 
logical form. In such models, a complex word is not a concatenation of morphemes but 
is considered as a single piece. IA models clearly reject lexemes as a pertinent unit. 
IP models show less consensus: they hesitate between morpheme-based and word- (or 
lexeme-) based theories, and some of them continue to involve morphemes. Corbin's 
position illustrates this hesitation. While adopting the lexicalist hypothesis, Corbin (1987) 
never uses the term lexeme: she advocates “une morphologie du morphème (...) ou plus 
exactement une morphologie du morphème-mot” [a morphology of the morpheme (...) or, more 
exactly, a morphology of the morpheme-word] (p. 183) and treats affixes as morphemes 
(p. 285). 

Indeed, “this conflict between morpheme-based and lexeme-based theories has haunted 
generative grammar ever since” (Lieber 2015a). 

The work collected in this volume is representative of the increasingly dominant view 
that the lexeme is an unavoidable component of useful morphological descriptions as 
well as theorizing. The high number of French scholars represented in the volume 
reflects the importance that the notion of a lexeme has had for that community for the 
past twenty years, mostly under the impetus of Bernard Fradin (1993, 2003), and the 
group of researchers involved in the CNRS cooperation network Groupe de Recherche 
Description et modélisation en morphologie, which he coordinated between 2000 and 2007. We 
are happy to dedicate this volume to him. 


⁴ Although the term morpheme was coined by Baudouin de Courtenay in 1895 with a meaning close to the 
contemporary one, its widespread usage with that meaning can be traced back to Bloomfield (1933) and 
his immediate readers. See Anderson (2015) and Blevins (2016) for relevant discussion of the history of the 
morpheme. 


3 Presentation of the volume 


While the notion of lexeme is in widespread use in contemporary descriptive and 
theoretical morphology, many questions remain unresolved. Among others: what exactly is 
a lexeme, a theoretical description or an object manipulated by rules? Is the difference 
between lexemes and word-forms as clear as in Matthews’ definition? Are lexemes and 
Lexeme Formation Rules (LFRs) always sufficient to explain the formation of the lexicon? Do 
LFRs always apply to lexemes? 

The twenty papers collected in this volume address these questions and some 
others. They are organized in four sections: 


3.1 Lexemes in standard descriptive and theoretical lexeme-based 
morphology 


Three papers centrally deal with this first theme. 

In his atypical but stimulating contribution based on his own intellectual biography, 
Aronoff traces the emergence of the lexeme in descriptive and theoretical morphology since 
the 1960s in Generative Grammar. 

In his paper, Boyé focuses on French cardinals and their place in Word and Paradigm 
models. He argues that, like simple French cardinals, complex cardinals are lexemes, and 
that their phonological idiosyncrasies can better be modeled in a morpholexical system 
than in syntax. 

Rainer studies the linguistic history of two keywords of economics and politics, viz. 
CAPITALIST and CAPITALISM, in which semantic change, calques and word formation 
(suffixation, conversion, suffix substitution) interacted in a complex manner. He argues 
that, within a morpheme-based model, it would not be possible to account for this 
history, which, consequently, supports the hypothesis of a lexeme-based conception of the 
word. 


3.2 Lexeme Formation Rules 


Lexeme Formation Rules (LFRs) are the main theme of four contributions. 

Amiot & Tribout deal with the category of the outputs of French suffixation(s) in -iste: are 
they basically adjectives, nouns, lexically underspecified, or do we need two different 
suffixations to account for the observed data? They opt for the last solution. They consider 
that, categorially and semantically, the French morphological system contains two 
suffixations: one of them basically forms professional nouns, the other basically forms 
adjectives meaning “in relation to (a practice, an ideology, an activity, a behavior)”. They 
argue that, because such properties can apply to humans, these adjectives can easily be 
converted into nouns. 

In her contribution, Dal addresses the status of French adverbs in -ment. Although they 
are usually considered derivational, she shows that this status is highly questionable. For 
her, neither inputs nor outputs clearly respect the constraints imposed by an LFR, and 
her conclusion is that they can be regarded as word-forms belonging to the paradigm of 
adjectives. 

Villoing & Deglas focus on two morphological patterns in Creole languages that form 
verbs from nouns: suffixation N-é and parasynthetic verbs dé-N-é. The hypothesis 
is that these two patterns emerged following the reanalysis of converted and prefixed 
French verbs. 

Strictly speaking, clipping of deverbal nouns is not a standard LFR. However, the 
treatment proposed in Štichauer’s paper, which applies Fradin & Kerleroux’s (2003) 
Hypothesis of a Maximal (Semantic) Specification, conforms to the standard conception of LFRs: 
in the case of polysemous lexemes, clipping applies to specific semantic features of 
lexeme-bases, and outputs inherit these features, without being synonymous with the full 
parental form. 


3.3 Troubles with lexemes 


Six of the contributions centrally address the issue of the definition of the lexeme and its use 
in morphological theories. 

Bonami & Crysmann’s contribution reevaluates the role of the lexeme in recent 
Head-Driven Phrase Structure Grammar (HPSG) work that integrates a truly realisational theory 
of inflection within the HPSG framework (Bonami & Crysmann 2016). After having 
distinguished two notions of an abstract lexical object, namely lexemes, which are characterized 
in terms of their syntax and semantics, and flexemes (Fradin 2003: 159; Fradin & Kerleroux 
2003), which are characterized in terms of their inflectional paradigm, they show how 
the two notions interact to capture various inflectional phenomena, most prominently 
heteroclisis and overabundance. 

Cruz & Stump deal with essence predicates in San Juan Quiahije Chatino: do they 
fall in the domain of morphology or in the domain of syntax? Their conclusion is that, 
even though their structure comprises a predicate base and a nominal component, their 
inflectional morphology differs from that of simple lexemes. 

In his paper on traces of feminine agreement within complex words in Norwegian and 
Istro-Romanian, Enger tries to overcome troubles with lexemes. He combines a modified 
version of the Agreement Hierarchy (Corbett 1979) and grammaticalisation to explain 
what he considers to be intra-morphological meaning. 

Kihm examines the realization of the copula in Haitian Creole, suggesting that the 
absence of an overt copula in some contexts should be modeled by postulating an empty 
stem alternant. He outlines a formal account based on Crysmann & Bonami's (2016) 
Information-based Morphology, but extending that framework to the analysis of 
periphrastic inflection. 

Spencer asks whether lexemes are abstract representations of properties unifying 
a set of inflected word-forms or objects manipulated by rules. Using the architecture 
of his model of lexical relatedness, Generalized Paradigm Function Morphology 
(GPFM) (Spencer 2013), he proposes an answer for verb-to-adjective transpositions 
(participles), which can be seen as lexemes-within-lexemes given their double status 
as word-forms in relation to verbs and as lexemes in relation to their adjectival properties. 
His proposal is that a lexeme is not a theoretical observation but is best regarded as a 
maximally underspecified object, bearing all and only those properties which are not 
predictable from default specification. 

Flexemes are also the central issue of Thornton's paper. After reviewing the develop- 
ment of this notion since Fradin (2003) and Fradin & Kerleroux (2003), she focuses on 
the concept of overabundance in inflectional paradigms and presents data illustrating 
cases in which a single lexeme maps to two distinct flexemes. 


3.4 Troubles with Lexeme Formation Rules 


LFRs are questioned in seven papers. 

In their study of reduplication in Mandarin Chinese, where the difference between 
lexemes and word-forms is less apparent than in languages with clear inflection, Basciano 
& Melloni claim that the domain of application of reduplication is below the level of 
the word, or below X⁰ in the standard X-bar approach: for them, in Mandarin Chinese, 
base units do not have a lexical category and should be vague enough to make them 
compatible with nominal, verbal and adjectival meanings. 

Hathout & Namer explore the limits of LFRs in explaining and predicting the formation of 
the lexicon. They confront parasynthetic lexemes, in other words complex lexemes that 
apparently result from the simultaneous application of a prefixation and a suffixation, with 
different hypotheses. This recurrent theme leads them to propose the system ParaDis (for: 
Paradigms and Discrepancies). ParaDis is a model particularly useful for analyzing, explaining 
and predicting noncanonical formations (Corbett 2010). It is lexeme-based and combines 
the independence of the three dimensions of LFRs (Fradin 2003) with constraints on outputs 
founded on derivational families and derivational series (Hathout 2011, Blevins 2016). 

Giraudo validates this double view of complex words, which articulates syntagmatic and 
paradigmatic dimensions, from a psycholinguistic perspective. She identifies two levels 
in the processing of complex lexemes: the first decomposes complex lexemes into pieces 
called “morcemes”; the second deals with the internal structure of words according to 
LFRs and contains lexemes. Her model posits family clustering as an organizational 
principle of the mental lexicon. She argues that, during language acquisition, growth in 
family size continually strengthens links between complex lexemes. 

Montermini’s contribution is devoted to the variation of derivational exponents. Adapting 
the framework developed in Plénat & Roché (2014) and Roché & Plénat (2014, 2016), he argues 
that this variation obeys the same constraints as those which explain the forms of complex 
lexemes. 

Plag, Andreou & Kawaletz tackle a recurrent and central problem with LFRs: 
polysemy. They rely on frame semantics (Barsalou 1992a,b; Löbner 2013), an approach to lexical 
semantics based on elaborate structured representations modelling mental 
representations of concepts. They hypothesize that the semantics of a derivational process can be 
described as its potential to perform certain operations on the frames of the bases to 
which it applies. 

Schwarze also deals with the semantic outputs of LFRs. His hypothesis is that they are 
semantically underspecified. The model he proposes comprises four layers of 
representation: phonology, constituent structure, functional feature structure 
and lexical semantics. The meaning of complex words is treated in the framework of 
two-level semantics. It is assumed that LFRs derive underspecified semantic forms, starting 
from which the actual meanings are construed by recourse to conceptual structure. Three 
morphological processes are studied: French é- prefixation, Italian denominal verbs of 
removal, and French noun-to-verb conversion. 

Strnadová addresses the issue of apparent rivalry between French denominal adjectives 
and prepositional phrases in de+N where N is the lexeme-base of the adjective (or in 
relation to it). She discusses some motivations explaining the choice between the former 
and the latter strategy, and shows that they usually do not have the same distribution 
and, therefore, are not interchangeable. 


Part I 


Lexemes in standard descriptive 
and theoretical lexeme-based 
morphology 


Chapter 1 


Morphology and words: A memoir 
Mark Aronoff 


Stony Brook University 


Lexicographers agree with Saussure that the basic units of language are not morphemes but 
words, or more precisely lexemes. Here I describe my early journey from the former to the 
latter, driven by a love of words, a belief that every word has its own properties, and a lack 
of enthusiasm for either phonology or syntax, the only areas available to me as a student. 
The greatest influences on this development were Chomsky’s Remarks on Nominalization, 
in which it was shown that not all morphologically complex words are compositional, and 
research on English word-formation that grew out of the European philological tradition, 
especially the work of Hans Marchand. The combination leads to a panchronic analysis of 
word-formation that remains incompatible with modern linguistic theories. 


Since the end of the nineteenth century, most academic linguistic theories have de- 
scribed the internal structure of words in terms of the concept of the morpheme, a term 
first coined and defined by Baudouin de Courtenay (1895/1972, p. 153): 


that part of a word which is endowed with psychological autonomy and is for the 
very same reason not further divisible. It consequently subsumes such concepts 
as the root (radix), all possible affixes, (suffixes, prefixes), endings which are expo- 
nents of syntactic relationships, and the like. 


This is not the traditional view of lexicographers or lexicologists or, surprising to 
many, Saussure, as Anderson (2015) has reminded us. Since people have written down 
lexicons, these lexicons have been lists of words. The earliest known ordered word list is 
Egyptian and dates from about 1500 BCE (Haring 2015). In the last half century, linguists 
have distinguished different sorts of words. Those that constitute dictionary entries are 
usually called lexemes. Since the theme of this volume is the lexeme, I thought that it 
might be useful to describe my own academic journey from morphemes to lexemes. Cer- 
tainly, when I began this journey, the morpheme, both the term and the notion, seemed 
so modern, so scientific, while the word was out of fashion and undefined. Morphemes 
were, after all, atomic units in a way that words could never be, and if linguistics were 
to have any hope of being a science, it needed atomic units. 

I grew up with morphemes. The structuralist phoneme may have fallen victim to the 
generative weapons of the 1960s, but no one questioned the validity of morphemes at 
MIT. They were needed to construct the beautiful syntactic war machines that drove all 
before them, beginning with the analysis of English verbs in Syntactic Structures, which 
featured such stunners as the morpheme S, which “is singular for verbs and plural for 
nouns (‘comes’, ‘boys’)” and Ø, “the morpheme which is singular for nouns and plural 
for verbs, (‘boy’, ‘come’)” (Chomsky 1957: 29, fn. 3). 

Aside from brief mentions here and there in Syntactic Structures and the cogent but 
little noted discussion at the end of Chomsky's other masterwork, Aspects (Chomsky 
1965), by the time I arrived at MIT as a graduate student in 1970 there was no talk of 
morphology; the place was all about phonology and syntax. These two engines, which 
everyone was hard at work constructing, would undoubtedly handle everything in lan- 
guage worth thinking about. My problem was that I very quickly discovered that I had 
little taste for either of the choices, phonology or syntax. It was like having a taste for 
neither poppy seed bagels nor sesame seed bagels, and having no other variety available 
at the best bagel bakery in the world, but still wanting a bagel. This had never happened 
to me before, and not just with bagels. Maybe I should go to another store, but I liked 
the atmosphere in this one a lot and, like the St. Viateur bagel shop, famous to this day 
(www.stviateurbagel.com), it was acknowledged to be the best in the world. 

What I did love was words. I had purchased a copy of the two-volume compact edition 
of the Oxford English Dictionary (OED) as soon as I could scrape together the money to 
buy one, even though reading the microform-formatted pages of the dictionary required 
a magnifying glass. I also owned a copy of Webster’s III. I kept these dictionaries at home, 
not at my desk in the department. Dictionaries and the words they contained were my 
dark secret. Why should I tell anyone I owned them? These dictionaries served no pur- 
pose in our education, where the meanings of individual words were seldom of much 
use, though we did talk a lot about the word classes that were relevant to syntax: raising 
verbs, psych verbs, ditransitive verbs. The only dictionary we ever used in our courses was 
Walker's Rhyming Dictionary, a reverse-alphabetical dictionary of English, first published 
in 1775. Its main value, as Walker had noted in his original preface, was “the 
information, as to the structure of our language, that might be derived from the juxtaposition 
of words of similar terminations.” Chomsky & Halle had mined it extensively in their 
research for The Sound Pattern of English and it was to prove invaluable in my work on 
English suffixes, though I did not know it at first. 

The 1960s had seen the brief flowering of ordinary language philosophy, whose 
proponents, beginning with the very late Wittgenstein (1953), were most interested in how 
individual everyday words were used, in opposition to the logical project of Wittgen- 
stein's early work. Despite the popularity of such works as Austin (1962) and Searle 
(1969), ordinary language philosophy never went very far, at least in part because its 
proponents never developed more than anecdotal methods of mining the idiosyncratic 
subtleties of usage of individual words. But there was no contradicting the view that 
every word is a mysterious object with its own singular properties, a fact that most of 
my colleagues willfully ignored, in their search for the beautiful generality of rules. The 
question for me was and remains how to balance the two, words and rules. 

Morris Halle had given a course on morphology in the spring of 1972, in preparation 
for his presentation at the International Congress of Linguists in the summer. Noam 
Chomsky had published a paper on derived nominals two years before, in 1970, which, 
though it was directed at syntacticians, provided a different kind of legitimation for the 
study of the individual words that my beloved dictionaries held. Maybe I could find 
something there, I said to myself with faint hope, though the approach that Halle had 
outlined did not open a clear path for me and I knew that I was not a syntactician, so 
Chomsky’s framework did not appear at first to provide much hope, despite his attention 
to words. 

Beginning in early 1972, I spent close to a year reading everything I could lay my hands 
on that had anything to do with morphology. I started with Bloomfield and the classic 
American Structuralist works of the 1950s that had been collected in Martin Joos’s (1958) 
Readings in Linguistics, almost all of which dealt with inflection. Though I learned a lot, 
I couldn’t find much of anything in that literature to connect with the sort of work that 
was going on in the department or in generative linguistics more broadly at the time. 

In the end, I did find something to study in morphology, though not in generative 
linguistics. I have come back to this topic, English word formation, again and again ever 
since, but only now am I beginning to gain some real grasp of how it works. The seeds of 
my understanding were sown in my earliest work on the topic but they lay dormant for 
decades, until they fell on fertile ground, far outside conventional linguistic tradition. 
And though again I did not come to understand it for decades, word-formation was also 
a fine fit for the Boasian approach that I had learned to love in my first undergraduate 
linguistics training, in which the most interesting generalizations are often emergent, 
rather than following from a theory. Also, the nature of the system in morphology, and 
especially word-formation, is much better suited to someone of my intellectual predilec- 
tions. This is an area of research in which regular patterns can best be understood in 
their interplay with irregular phenomena. I enjoy this kind of play. 

Word-formation and morphology in general had had an odd history within the short 
history of generative linguistics before 1972, generously twenty years. One of the best- 
known early generative works was about word-formation, Robert Lees’s immensely suc- 
cessful Grammar of English Nominalizations (1960). This book, though, despite its title, 
dealt mostly with compounds and not nominalizations, using purely syntactic 
mechanisms to derive compounds from sentences, seemingly modeled on the method of 
Syntactic Structures.¹ Lees's book directly inspired very little research on word-formation in 
its wake, though the idea of trying to derive words from syntactic structures has surfaced 
regularly ever since (Marchand 1969, Hale & Keyser 1993, Pesetsky 1995). 

Chomsky’s 1970 “Remarks on nominalization” (henceforth Remarks) echoed Lees’s 
book in title only. It was in fact its complete opposite in spirit, method and conclusions, 
although Chomsky never said so. After all, he owed Lees a great personal debt. Lees had 
played a large role in making Chomsky famous with his (1957) review in Language of 
Chomsky (1957). Remarks injected for the first time into generative circles the 
observation that some linguistic units, in this case derived words, are semantically idiosyncratic 
and not derivable in syntax (unless one is willing to give up on the bedrock principle of 
semantic compositionality). Word-formation, it turns out, is centered on the interplay 
between the idiosyncrasies of individual words that Chomsky noted and the regular sorts 
of phenomena that are enshrined in the rules of grammar. 


¹ Lees’s book went through five printings between 1960 and 1968, extraordinary for a technical monograph 
that was first published as a supplement to a journal and then reissued by a university research center. 

My first excursion into original morphological research took place in the fall and win- 
ter of 1972-73, a time when I was entirely adrift. I had begun to read widely and desper- 
ately on morphology early in 1972, hoping it might save me from myself, but had not 
yet lit on any phenomenon that held the faintest glimmer of real promise. This is the 
lifelong agony of an academic: the struggle to find something that is both new and of 
sufficient current interest for others to give it more than a passing glance. For some reason,
I embarked on a study of Latinate verbs in English and their derivative nouns and
adjectives: verbs like permit and repel, which contain a Latin prefix followed by a Latin
root that does not occur independently in English, and their derivatives permission and
permissive, repulsion and repulsive. All the verbs had been borrowed into English and I
can’t recall for the life of me what led me to study this peculiar class of words. 

What I first noticed about these verbs and their derivatives was that the individual 
roots very neatly determined the forms of the nouns and adjectives derived from the verb by
affixation. Each individual root such as pel generally set the form of the following noun suffix
(always -ion after pel). A given root often also had an idiosyncratic form (here puls-)
before both the noun and adjective suffix: compulsion, compulsive; expulsion, expulsive; 
and so on for all verbs containing this Latinate root. With a very small number of excep- 
tions, the pattern of root and suffix forms was entirely systematic for any given root but 
idiosyncratic to it, and therefore predictable for many hundreds of English verbs, nouns, 
and adjectives. The whole system was also obviously entirely morphological. And best 
of all, no one had noticed it before. I had discovered something new in morphology and 
I quickly outlined my findings in by far the longest paper that I had ever written, almost 
fifty pages, filled with typos, which I completed in April 1973. 

The central results of this first work were entirely empirically driven. I have prized 
empirical findings above all other aspects of research ever since, because these findings 
don’t change with the theoretical wind. The generalizations I found are as true today as 
they were in 1973. In this emphasis on factual generalization I differ from most of my 
linguist colleagues. Of the empirical discoveries that I have made over the years, I am 
proudest of three: this one, the morphome, and the morphological stem. 

It wasn’t long before I realized that Latinate roots presented a fundamental problem
for standard structural linguistic theories of morphology. All of these theories were,
and many still are, based on the still unproven assumption that Baudouin de Courtenay
had first made explicit almost a century before in linguistics, that all complex linguistic
units could be broken down exhaustively into indivisible meaningful units, which
were reassembled compositionally (in a completely rule-bound manner) to make up utterances.²
The problem was that, although these Latinate roots could not be said to have
constant meaning, or in some cases any meaning at all that could be generalized over all
their occurrences, they had constant morphological properties. The English verbs admit,
commit, emit, omit, permit, remit, submit, transmit, and so on, do not share any common
meaning. What they do share are the morphological peculiarities of the root mit.
The classical Latin verb mittere meant ‘send’ and the prefixed Latin verbs to which the
English verbs are traceable may have had something to do with this meaning in the deep
historical past of Latin, but even in classical times the prefixed verbs had begun to diverge
semantically from their base and from each other. What ties them so closely together in
English is only the structural fact that, without exception, they share the alternant miss
before the noun suffix -ion and the adjective suffix -ive, and that the form of the noun
suffix that they take is similarly always -ion, and not -ation or -ition.

² The idea that morphology and syntax are both compositional was simply assumed from the beginning,
though it should be noted that Baudouin's work predates Frege's discussion of compositionality.
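
The pattern described above can be made concrete in a small sketch. It is only an illustration under stated assumptions: the dictionary layout and the function names are invented here, and the two roots are simply the ones mentioned in the text (mit/miss and pel/puls). The point it tries to show is that the lookup is keyed to the form of the root, with no reference to meaning.

```python
# Illustrative sketch only: the data layout and function names are
# assumptions made for this example, not the author's own notation.

# Each Latinate root lists the alternant it takes before -ion and -ive.
ROOT_ALTERNANTS = {
    "mit": "miss",  # permit -> permission, permissive
    "pel": "puls",  # repel -> repulsion, repulsive
}


def split_verb(verb):
    """Return (prefix, root) if the verb ends in a listed root, else None."""
    for root in ROOT_ALTERNANTS:
        if verb.endswith(root) and len(verb) > len(root):
            return verb[: -len(root)], root
    return None


def derive(verb):
    """Return the (noun, adjective) pair predicted by the verb's root."""
    parts = split_verb(verb)
    if parts is None:
        raise ValueError(f"no listed Latinate root in {verb!r}")
    prefix, root = parts
    stem = prefix + ROOT_ALTERNANTS[root]
    return stem + "ion", stem + "ive"


for verb in ["permit", "submit", "repel", "compel"]:
    print(verb, derive(verb))
```

Nothing in the sketch records what mit or pel means; the root is identified and its alternant selected purely by shape, which is the property the paragraph above attributes to these roots.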

The verb root mit/miss has very consistent, unmistakable, and idiosyncratic morpho- 
logical properties in English today. Unless we choose to disregard them, these properties 
must be part of the morphology of the language. But the root has no meaning, so it can’t 
be a morpheme in the standard sense. How can we make sense of this apparent paradox? 

The answer is found in the empirical observation that formed the core of Chomsky’s 
Remarks: derived words are not always semantically compositional. This observation, 
which Chomsky called the lexicalist hypothesis, is the single greatest legacy of Remarks. 
It is far from original; only its audience is new. Jespersen, for example, writing about 
compound words, had pointed out many times over several decades that the relations 
between the members of a compound are so various as to defy any semantically predic- 
tive analysis. Jespersen concluded that the possible relations between the two members 
of a compound are innumerable: 


Compounds express a relation between two objects or notions, but say nothing of 
the way in which the relation is to be understood. That must be inferred from the 
context or otherwise. Theoretically, this leaves room for a large number of different 
interpretations of one and the same compound [...] On account of all this it is 
difficult to find a satisfactory classification of all the logical relations that may be 
encountered in compounds. In many cases the relation is hard to define accurately
[...] The analysis of the possible sense-relations can never be exhaustive. (Jespersen 
1954: 137-138) 


The purpose of Remarks had been tactical. As Harris (1993) recounts in detail, at the 
time of writing the article, Chomsky was locked in fierce combat with a resurgent group 
of younger colleagues, the generative semanticists, who sought to ground all of syntax 
in semantics. Syntax at the time was assumed to encompass word-formation, though in 
truth almost no work had been done on word-formation besides Lees (1960). Reminding 
everyone in the room that at least some word-formation was not compositional, a purely 
empirical observation, cut the legs out from under generative semantics in a single stroke 
from which the movement never recovered. More importantly, although Chomsky never 
mentioned it and may not have realized it, the demonstration that some complex words 
are not semantically compositional also destroyed Baudouin’s traditional morpheme and 
lent support to Saussure’s sign theory of words. The non-compositional complex words
at the core of Remarks lie within the class of what Jespersen (1954) called naked words:
uninflected words. Complex naked words are formed by derivational morphology and 
compounding. Inflected forms, by contrast, are always compositional, because they real- 
ize cells in the morphosyntactic paradigm of the naked word. Their properties are acci- 
dental, in the traditional grammatical sense of the term, not essential. 

What I had learned from Remarks about compositionality within words, combined 
with my discoveries about meaningless Latinate roots, led me to realize that word-forma- 
tion needed to be studied in a way that was free from Baudouin’s axiom, an axiom that 
had held sway for over a century: that complex words can be broken down exhaustively 
into meaningful morphemes. Although I was entirely unaware of the consequence at the 
time, and remained unaware of it for decades, this discovery freed me to do linguistics 
in the way I loved to, not deductively as I had been taught to do at MIT, following some 
current theory where it led, and not inductively, but by working towards what the great 
Barbara McClintock had called “a feeling for the organism” (Keller 1983). My first two 
years at MIT had taught me that the theory and deduction game held little charm for 
me. Perhaps that’s because I wasn’t very good at it. Working on my own terms made 
me feel better about myself than I had for the entire preceding two years. I could stop 
worrying whether I was as smart as all those other people. It turned out I didn’t have to 
be smart. Common sense was at least as valuable, and much rarer in those circles. 

English had been an exotic object of inquiry for American linguistics from the start. 
The first American Structuralists were anthropological field workers who confined them- 
selves deliberately to the native languages of North America. Only in his very last years 
did Edward Sapir turn to English. Bloomfield discussed English in his Language (1933), 
presumably to engage a broad readership, but in his technical writing he too dealt mostly 
with languages of North America on which he did original fieldwork. Bloomfield’s successors,
notably Trager & Smith (1951), did important work on English, but they were
in a decided minority. 

Generative grammar was different. The vast bulk of research in the first two decades, 
beginning with Chomsky et al. (1956), had been on English. This English bias was espe- 
cially true of generative syntax, whose success was due in no small part to the analyst 
being able to come up with novel sentences on the fly that the grammar could label as ei- 
ther grammatical or ungrammatical. Only a native English speaker could have come up 
with the most important sentence in the history of linguistics, Chomsky’s colorless green
ideas sleep furiously.³ Even in generative phonology, whose earliest works, Chomsky
(1951) on Modern Hebrew and Halle (1959) on Russian, had dealt with other languages,
the high-water mark of this tradition was an analysis of English, The Sound Pattern of En- 
glish. It was therefore not entirely unexpected that I should turn my attention to English 
word formation. Even my earliest excursion into morphology had dealt with English, 
albeit Latin roots that had been borrowed into English. It would be a decade before I 
looked seriously at word-formation in other languages (Aronoff & Sridhar 1984). 

³ All the data in the most important American structuralist work on syntax before Syntactic Structures, Wells
(1947), is from English, except for one small example from Japanese.

American linguists had not written much about word-formation in the preceding quarter
century. The great Structuralists from Bloomfield to Hockett had done seminal work
on morphology. Much of it was collected in Martin Joos’s (1958) Readings in Linguistics,
which I read carefully, along with the chapters on morphology in Bloomfield’s Language
(1933). But the Structuralists had dealt almost exclusively with inflection. I could
find almost nothing on uninflected words. There was Lees’s (1960) monograph, but his 
approach was not useful in a post-Remarks environment, and besides, he mostly dealt 
with compounds. 

The most notable exception of the previous decade had been Karl Zimmer’s mono- 
graph on English negative prefixes (Zimmer 1964). This book opened up an entirely new 
world for me, the tradition of English linguistics. This world had existed for a century 
and more, parallel to the one I inhabited but completely unknown to us, and it was one 
in which the study of word-formation had always occupied an important place. 

English linguistics had emerged in departments of English language and literature, 
where in the 1970s it still retained the connections to philology that most of the rest of 
the field had left behind in the 19th century. To this day, it is much more rooted in texts
than other kinds of linguistics, because of its closeness to literature. Much of English 
linguistics was historically oriented, but in a very different way from the comparative 
historical linguistics that lay at the root of modern structural linguistics. Its focus was 
on the linguistic history of a single language, the record of English since its emergence 
as a distinct written language around 800 CE. The connection to philology lay in this 
shared basis of written texts, though philologists were much more literarily oriented. 
People who read Beowulf and Chaucer and Shakespeare had to know something about 
the language these people were writing in and English linguistics served this purpose. 

Every undergraduate English major—and there were many more in those days—had 
to take a course on the history of the English language. For the same reasons, English 
linguistics had sister disciplines in the other major standard European languages and 
language families: French, German, Italian, Spanish, Romance, Scandinavian, etc. As I 
learned much later, the OED was the greatest monument of this tradition of English lin- 
guistics, but much of the best work had been done on the European continent, especially 
in German departments of Anglistik. The best-known exponent of this tradition was a 
Dane, Otto Jespersen. 

Hans Marchand reviewed Zimmer’s monograph in Language in 1966. Marchand had 
fled from Germany to Istanbul in 1934 as a Catholic political refugee with the help of 
his mentor, the Jewish Romance philologist Leo Spitzer. He gradually turned towards 
the study of language rather than literature, remaining in Istanbul until 1953. Marchand 
returned to Germany in 1957, after a stint in the United States, to teach Anglistik at the 
University of Tuebingen. His book, The Categories and Types of Present-Day English Word- 
Formation, published in 1960 and greatly revised in 1969, has remained the authoritative 
description of English word-formation since its first publication. Remarkably, Marchand 
had written most of the book while in internal exile in Turkey in an Anatolian village 
from 1944 to 1945, under threat of repatriation to Germany, which had drafted him into 
the military in absentia in 1944. He had sought unsuccessfully for years to publish this 
early version while still in Turkey.

Marchand and Zimmer follow very similar approaches, quite different from that of
American structural linguistics. They ask what a given derivational affix means (what
Zimmer calls its “semantic content”), what it applies to, and what it produces. The prefix
un- that most occupies Zimmer’s mind, for example, is negative in meaning and derives
adjectives from adjectives.⁴ This is all very traditional and in line with the treatment of
derivational affixes in the OED, which contained entries for derivational affixes from 
the beginning, though not for inflectional affixes. The adjectival negative prefix un- has 
a very extensive entry in OED, with many observations similar to those of Marchand 
and Zimmer, and hundreds of examples (my favorite being unpolicemanly). The OED 
even notes the morphological environments in which a given derivational affix is partic- 
ularly productive, which was of special importance to Zimmer and to my own work. For 
un-, the OED notes that it is especially common with adjectives ending in -able: “In the 
modern period the examples become too numerous for illustration; in addition to those 
entered as main words, those given below will serve as specimens of the freedom with 
which new formations are created.” 

This traditional approach to word-formation provided an intuitively satisfying solu- 
tion to the problem of the morpheme that my work on Latinate roots had uncovered. If 
derivation is not a matter of combining morphemes but of attaching affixes to words, 
then we don’t need all the morpheme components of words to be meaningful and we 
don’t need the internal semantics of words to be compositionally derived from these 
components. All we need is for words to be meaningful. We don’t need to worry about 
morphemes at all, only words and what the derivational affixes do with them. 

This traditional approach circumvented the problem of meaningless morphemes for a 
simple reason: it predated the notion of the morpheme. The earliest citation in OED by far 
for any sense of the word derivation equates it with formation. It comes from Palsgrave’s 
1530 English-language grammar of French, L'esclarcissement de la langue françoyse, the
first known grammar of French ever written in any language: “1530 J. Palsgrave Lesclarcissement
68 Derivatyon or formation, that is to saye, substantyves somtyme be fourmed
of other substantyves.” This has become my favorite citation of the words derivation and
(word) formation and, though I did not know it at first, it encompasses the claim that
words are formed from words; my observation that words are formed from words merely
updates Palsgrave's remark. This claim is the essence of the traditional treatment of word-formation
and it is the motto that I adopted, elevating the observation to a principle.⁵

In my dissertation and subsequent monograph, I took complete credit for the axiom 
that morphology was word-based. Even decades later, when I clarified the terminology 
and called it lexeme-based morphology, I did not provide any direct attribution to the 
tradition of English word-formation studies. My only defense is that neither Marchand 
nor Zimmer ever stated what for them was simply an unspoken assumption. All I did 
was to make this assumption clear as an axiom. I can therefore at least take credit for the 
realization that this was a useful axiom on which to base the analysis of word-formation. 


⁴ Un- also attaches to verbs and has the sense of undoing the action of the verb. Whether these two are one
and the same affix has been much discussed (Horn 1984).

⁵ The idea that words are formed from words may ultimately be traceable to the Greek and Latin grammatical
traditions, which were entirely word-based, even at the level of inflection (Robins 1959).

Notation meant everything in those days. Chomsky & Halle (1968) had even gone so
far as to extoll the explanatory power of parentheses. My most important task was there- 
fore to create a simple notation in which traditional OED-style generalizations about 
word-formation could be stated in a way that generative linguists might understand. 
This was the word-formation rule (WFR). It bore close resemblance in form to the rewrite 
rules that were standard in generative grammar. A WFR took a word from one of the 
three major lexical categories (Noun, Verb, or Adjective) and mapped it onto a lexical cat- 
egory (the same or another), usually adding an affix, and making another word. The rule 
of un- prefixation, for example, could be written as [X]A → [un-[X]A]A or it could be
written simply as the output [un-[X]A]A. This notation was transparent and made gen- 
erative linguists, myself included, think that this way of dealing with word-formation 
could be easily assimilated into their way of thinking. The acronym WFR added a nice 
touch. The title of the published version of my dissertation, Word Formation in Genera- 
tive Grammar (Aronoff 1976) was suggested by S. Jay Keyser, the editor of the series of 
which this would be the inaugural monograph. It only served to strengthen the impres- 
sion that I had integrated the study of word-formation into generative grammar. The 
monograph was a great success, thanks in no small part to its title, and most accounts 
treat the book as central to the treatment of morphology and word-formation within 
generative grammar. 

Nothing could be further from the truth. The title of the monograph was deeply decep- 
tive and in agreeing to it I was also deceiving myself. Word formation rules, as conceived 
of and discussed in that monograph, are incompatible with generative grammar or with 
any grammar-based linguistic framework, because, like the tradition they encode, these 
rules cross the synchronic-diachronic boundary that is central to all post-Saussurean 
structural linguistics. I have only recently come to appreciate this fact. I certainly be- 
lieved at the time that I was doing generative grammar, as have most of the book's 
readers since. What is true is that I was a member of a social community self-organized 
around generative grammar. I did my work on word-formation within that community 
and it was accepted as legitimate almost entirely on those social grounds. 

In his great posthumous work, Saussure (1916/1959) set up a distinction that has been
accepted throughout the field ever since, between synchronic and diachronic linguistics. 
Synchronic linguistics deals with a single state of a language—the present—while di- 
achronic linguistics deals with successive states—history. Generative grammar seeks to 
provide a theory of what is a possible synchronic grammar of a language, the basic 
idea being that the grammar generates the language (Chomsky 1957). The theory is also 
supposed to mirror the innate capacity that a child brings to the task of constructing a 
grammar for the input that the child receives (Chomsky 1965). But traditional research 
on word-formation, which preceded Saussure in its origins, is neither synchronic nor 
diachronic: it is about how new derived words accumulate in a language over time. 
That is why Marchand gave his magnum opus the subtitle “A Synchronic-Diachronic
Approach” and why Jespersen called his monumental six-volume life's work A Modern
English Grammar on Historical Principles, both titles in direct contradiction of the
Saussurean split, both by scholars working within the tradition of English linguistics. In
truth, Marchand’s approach was neither synchronic nor diachronic, in spite of its fashionable
title, because the study of word formation lends itself to neither synchrony nor
diachrony: the word formation system of the language at any given moment can only 
be understood through the historical accumulation of the lexicon. The study of word- 
formation is concerned at its core with how words are created, how they are formed, 
and how they are added to the language. Unlike sentences, words, once formed, accumu- 
late, and this accumulated storehouse has an effect on new words. Words accumulate 
both in the mental lexicon of an individual speaker and in the collective lexicon of the 
larger linguistic community. 

This brings us back to Chomsky’s lexicalist hypothesis. To understand this hypoth- 
esis, we need to clarify two distinct senses of the word lexical (Aronoff 1988). One is 
Bloomfield’s lexicon, the list of what Di Sciullo & Williams (1987) later
so nicely called the “unruly.” The other encompasses the word-formation rules themselves
and maybe all morphology including inflection too. The term lexical component
is usually meant to include both the rules of morphology and the lexicon. Chomsky’s 
original lexicalist hypothesis says no more than that the lexical component is responsi- 
ble for forming and storing some of the complex words of the language, in addition to 
the simple monomorphemic words that have always been thought of as arbitrary signs 
stored in the lexicon. His major criterion for distinguishing lexically from ‘transformationally’
derived words is semantic predictability or compositionality (lexically derived
words are not compositional), though most later lexicalist theorists used others as well
(Aronoff 1994, Pesetsky 1995). 

Halle’s (1973) lexicon, which he described as “a special filter through which the words 
have to pass after they have been generated by the word formation rules” (p. 5), is a 
Bloomfieldian list of words, separate from the morphological rules. Halle suggested that 
“the list of morphemes together with the rules of word-formation define the set of poten- 
tial words of the language. It is the filter and the information that is contained therein 
which turn this larger set into the smaller subset of actual words” (p. 6). This way of 
looking at the relation between word-formation and the lexicon appears to permit us 
to include word-formation in a synchronic grammar: the morphemes and the abstract 
rules of word-formation will be part of the grammar, not the lexicon, while the actual 
results of the application of the rules to the morphemes, which can be quite messy and 
idiosyncratic, as Chomsky had already emphasized, will be housed outside the grammar 
in the Bloomfieldian lexicon. Words will be formed by rules in the grammar, just as sen- 
tences are, though perhaps by a distinct lexical component, along the lines of the theory 
of Remarks. On this story, though, once words are formed they are stored in the lexicon 
and should accordingly have no further interaction with the grammar or the rules. 
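
The architecture Halle describes can be sketched in a few lines. This is a toy illustration under stated assumptions, not Halle's own formalization: the function names are invented, the rule is the un- prefixation rule [X]A → [un-[X]A]A, and the small word lists are hypothetical (in particular, ungreen is simply assumed here to be absent from the list of actual words).

```python
# Toy sketch of a "rules plus filter" model. All names and word lists are
# illustrative assumptions, not data from Halle (1973).


def un_prefixation(adjective):
    """WFR [X]A -> [un-[X]A]A: map an adjective onto a prefixed adjective."""
    return "un" + adjective


# Hypothetical input bases for the rule.
ADJECTIVES = ["kind", "policemanly", "green"]

# The lexicon as a filter: a plain list of the words assumed to occur.
ACTUAL_WORDS = {"unkind", "unpolicemanly"}


def potential_words(bases, rule):
    """Everything the rule can generate from the given bases."""
    return [rule(base) for base in bases]


def actual_words(candidates, lexicon):
    """The subset of potential words that passes the lexical filter."""
    return [word for word in candidates if word in lexicon]


potential = potential_words(ADJECTIVES, un_prefixation)
print("potential:", potential)
print("actual:   ", actual_words(potential, ACTUAL_WORDS))
```

Information flows one way in this sketch: the rule never consults the list of actual words, and the list never changes what the rule does. That one-way flow is the strict separation of rules from stored words described in the paragraph above.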

Over the years, this general strategy of strictly separating the rules from the unruly in 
order to better assimilate word-formation to syntax, what Marantz much later called the
single engine hypothesis (Marantz 2005), has faced a number of problems, all of which are
traceable to the fact that the strategy allows for no interaction between the rules (and
the morphemes they operate on) and the set of words formed by the rules, which are
stored in the lexicon. The insulation of the rules from the lexicon makes it impossible to
ask many interesting questions with even more interesting answers. I will discuss briefly
here only the two most important ones, morphological productivity and blocking. 

Unlike most rules of syntax, rules of word-formation vary widely in their productivity. 
A standard example is the trio of suffixes -ness, -ity, and -th, all of which form nouns from 
adjectives in English. Of the three, -th is the least productive; only a handful of words
end in this suffix. The only one I can identify as having been added to the language in 
the last couple of centuries is illth, which was coined on purpose by John Ruskin in 1862 
to denote the opposite of wealth. The word is almost never used today, except in close 
proximity to wealth or health. Speakers of English know that new or infrequent words in 
-th have an odd flavor about them. The OED remarks about the word coolth, for example, 
that it is “Now chiefly literary, arch[aic], or humorous.” 

The suffix -ity is more productive, but limited in the morphology of what it can attach 
to. The OED lists approximately 2400 nouns in current use ending in the letter sequence 
<ity>, most of which contain the suffix, compared with about 3600 ending in the letters 
<ness>. But a closer look reveals that <ity> is much more likely to appear after a select 
set of suffixes. With -ic it is preferred by a ratio of almost 7/1 over -ness. This preference is 
reflected in speakers’ judgments and in the relative frequency of members of individual 
pairs. The word automaticity feels much more natural than automaticness and a simple 
Google search shows 109,000 “hits” for automaticity but only 242 for automaticness. Even 
for very rare words, the same pattern emerges. While oceanicity, a word I have never 
heard of, gets only 762 hits, its counterpart, oceanicness, gets only 5! 
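
The size of the preference can be read directly off the hit counts quoted above. The short calculation below uses only those four figures; the variable and function names are assumptions made for the example, and web hit counts are of course a rough and unstable measure.

```python
# Hit counts exactly as reported in the text; names are illustrative.
HITS = {
    "automatic": {"ity": 109_000, "ness": 242},
    "oceanic": {"ity": 762, "ness": 5},
}


def ity_to_ness_ratio(counts):
    """How many -ity hits there are for every -ness hit."""
    return counts["ity"] / counts["ness"]


for base, counts in HITS.items():
    ratio = ity_to_ness_ratio(counts)
    print(f"{base}: -ity outnumbers -ness by roughly {ratio:.0f} to 1")
```

On these figures the -ity form leads by about 450 to 1 for automatic and about 152 to 1 for oceanic, so the preference holds for the rare word as well as the common one.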

Once we leave the few affixes that -ity is attracted to, though, -ness is ascendant. Green- 
ness outnumbers greenity 1000/1. Google even thinks that you have made a mistake when 
you search for greenity and asks: “Did you mean: greenify?” A similar pattern of results 
is found for all the other color words. In the same vein, we can find examples of humor- 
ous uses of words like sillity or slowity in the Urban Dictionary, but not in many other 
places on the Web. 

There are numerous ways of distinguishing the productivity of these three suffixes, 
but productivity is clearly related to the number of words that are already present in the 
language: the more you have, the more you get. Productivity depends on the accumula- 
tion of words. It is a dance between the lexicon and the grammar. If we try to make a 
strict separation between the two, we will never understand how the dance works. Both 
Marchand and Zimmer knew about the nuances of productivity. Marchand closes his 
review of Zimmer’s book with the following somewhat backhanded compliment: “Zim- 
mer’s investigation is a valuable contribution not to the study of semantic universals, 
which it planned to be, but to the problem of productivity in word-formation” (Marc- 
hand 1966: 142). 

The other problem that productivity poses for modern linguistics is that it is variable.
Mainstream formal linguistics, with its roots in the triumphal 19th century neogrammarian
slogan that sound change laws have no exceptions (Paul 1880), has never
dealt well with variation. If anything, formal linguists continue to be blind to the fact 
that variation is a part of language (I-language). One response to variability is simply 
to deny that a phenomenon like productivity exists. Another is to admit that it exists,
but to deny that the phenomenon is variable, claiming instead that it is all or none. That
is what Marchand does. Referring to Harris (1951: 225), Marchand notes disapprovingly
that “a descriptivist like Zellig S. Harris maintained that ‘the methods of descriptive linguistics
cannot treat of the degree of productivity of elements’” (Marchand 1966: 141).
But he himself only dichotomizes word-formation rules into those that are productive 
and those that are, in his words, restricted: 


Zimmer’s merit is to have seen an important problem in word-formation, that of 
productivity. ... Zimmer's study . . . calls our attention to the fact that what seems 
to be the same type of combination, viz. derivation by means of a negative prefix, 
is in reality split up into two groups, one of restricted productivity (instanced by 
unkind) and another, deverbal group (instanced by unread) which is of more or less 
unrestricted productivity (Marchand 1966: 141). 


Even here, Marchand is not talking about one productive rule vs. a different unproduc- 
tive rule, but rather a single rule, which is more productive in one environment (with 
past participles and -able derivatives, both of which have a passive reading) and less 
productive in another (with underived adjectives like kind). As Zimmer demonstrates, 
there is not in fact a dichotomy, but rather a cline in productivity that depends on both 
environments and rules. In the half century since, the nondiscrete nature of productivity 
has been demonstrated time and again, most definitively in Bauer (2001). 

Productivity is a question of fecundity, how many words there can be and how easily 
they can be created. A pattern is highly productive if there can be many new words 
in that pattern. It is unproductive if there can be only a few new words. When we say 
that the English nominal suffix -ness is highly productive we mean that the pattern can 
form many nouns from adjectives; when we say that the suffix -th, which also derives 
nouns from adjectives, is unproductive, we mean that it cannot. And because words are 
formed from words, there is a direct relation between how easy it is to form words in 
a pattern and how many already exist in that pattern, in either the mind of a speaker 
or the language of a community. As we have just seen, there are many -ness nouns in 
English. The OED lists over 4000 nouns ending in the letters <ness>, the great majority of
them containing the suffix. There are no more than a handful of -th nouns derived from 
adjectives. If how many words there can be of a given type depends on a combination 
of how many words there are already of this type and how many there are for the type 
to feed on, then words differ sharply from sentences. For starters, it makes little sense to 
even ask how many sentences there are of a given type. Sentences are not stored, they 
are produced and then vanish. 

Blocking is the second phenomenon that demonstrates how the formation of individ- 
ual words depends intimately on the words we already know. For four decades, since 
the moment that I first stumbled on this phenomenon, it has been clear to me that block- 
ing is a real empirical phenomenon and that it is just what I first defined it to be: "the 
nonoccurrence of one form due to the simple existence of another" (Aronoff 1976: 43). 
A few pages later, I made an explicit connection to synonymy: "Blocking is basically 
a constraint against listing synonyms in a given stem" (Aronoff 1976: 55). And on the 
same page I wrote: “To exclude having two words with the same meaning is to exclude
synonymy, and that is ill-advised.” A few pages later, I referred to “the blocking rule.”
Clearly, I had no idea precisely what blocking was, beyond an empirical phenomenon. 
Only now, though, do I understand why my empirical observation might be true: the 
avoidance of synonymy in general and blocking in particular are the result of competi- 
tion, a topic I have spent the last half decade investigating. 

The tradition of word-based morphology dates to the first grammarians, although it 
was eclipsed for much of the twentieth century by the rise of synchronic linguistics. In 
Cambridge, Massachusetts one didn’t learn much about what was happening in Cam- 
bridge, England, but soon after leaving for Stony Brook I learned that word-based mor- 
phology had been revived in England in the decade or so before my own research, no- 
tably by R. H. Robins (1959) and Peter Matthews (1965, 1972). This line of research, es- 
pecially in derivational morphology, has grown in the decades since, notably in France, 
led by Danielle Corbin (1987), Françoise Kerleroux (1996), and Bernard Fradin (2003). To- 
gether, they created a new thriving research community, of which I am proud to be a 
member. 
Chapter 2


Lexemes, categories and paradigms: 
What about cardinals? 


Gilles Boyé 
Université Bordeaux-Montaigne & UMR5263 (CNRS) 


In Word and Paradigm frameworks such as Network Morphology (Corbett & Fraser 1993) 
and Paradigm Function Morphology (Stump 2001), categories and lexemes are taken as 
granted and usually associated with an inflectional paradigm relevant for all the lexemes 
in a given category. In Section 2, we explore the status of French cardinals as lexemes based 
on the characteristic properties defined by Fradin (2003): i) abstraction over form-variation, 
ii) autonomous forms, iii) stable meaning, iv) belonging to a major category, v) open-ended 
set of units that can serve as input and/or output of morphology. We start with the simple
cardinals and argue, following the discussion in Saulnier (2008), that French cardinals fit all
the lexemic criteria but (iv), belonging to a major category, and should be considered full 
lexemes even though they constitute a sub-category of determiner, a minor category in Fra- 
din’s terms. In Section 3, moving from simple cardinals to complex ones, we show that the 
idiosyncratic morphophonological properties of French cardinals plead for a morphological 
analysis rather than a syntactic one, giving an analysis of their construction as multi-layered 
compounds. In Section 4, we describe the inflectional paradigms of French cardinals as de- 
pendent on their rightmost element using the Right Edge mechanism introduced by Miller 
(1992) and Tseng (2003) for other phenomena in French. In the conclusion, we show that 
some complex cardinals have to be analyzed as multi-layered morphological compounds 
due to their morphophonological idiosyncrasies, but this does not entail that all complex
cardinals should be. The fact that syntactic combinations of French cardinals do not respect
lexical integrity indicates that to some extent, complex cardinals are in the shared custody 
of morphology and syntax. 


1 Introduction 


In this paper, following the lead of Saulnier (2008, 2010), we explore the status of French 
cardinals and their place in Word and Paradigm frameworks, within theories of mor- 
phology focusing on lexemes as their fundamental unit. In general, this topic poses in- 
teresting problems for linguistic theories: 
* Are they lexemes? To what category do they belong: determiners, nouns, adjectives?

* Are they built by syntax or in the lexicon?

* Is there an inflectional paradigm for cardinals? If so, where does it come from?


In Section 2, we explore the categorial status of simple cardinals. In Section 3, we argue 
that complex cardinals are lexemes, like simple cardinals, even though they constitute a 
subcategory of determiners.¹ We outline a syntagmatic analysis to create complex cardinals
in morphology as compounds. In the last section, we propose an analysis of the
inflectional paradigm of cardinals based on the Right Edge mechanism introduced by 
Miller (1992) and Tseng (2003) for other phenomena in French. 


2 French cardinals: Lexemes? 


In this section, we examine the lexical status of French cardinals.²

Following Fradin (2003: 102), we distinguish two types of atomic units in the lexicon: 
lexemes and grammemes. Lexemes are typically nouns, verbs, adjectives, adverbs, while 
grammemes are grammatical units such as prepositions, determiners, conjunctions. Fra- 
din identifies the following characteristic properties of lexemes: 


(1) a. It is an abstract unit to which word-forms are related; this unit captures the
       variations across word-forms.
    b. It possesses a phonological representation which gives it prosodic autonomy.
    c. Its meaning is stable and unique.
    d. It belongs to a category and can have an argument structure.
    e. It belongs to an open-ended set and can serve as output and input of
       derivational morphology.


Whatever the analysis of French complex cardinals such as vingt-et-un '21', simple car- 
dinals like vingt or un are underived and therefore have to be listed in the lexicon. In 
what follows, we argue that simple cardinals in French pattern with lexemes rather than 
grammemes. 

In French, the simple cardinals are the elements listed in (2) that serve as cardinals 
and as building blocks for complex cardinals.³


(2) un ‘1’, deux ‘2’, trois ‘3’, quatre ‘4’, cinq ‘5’, six ‘6’, sept ‘7’, huit ‘8’, neuf ‘9’,
dix ‘10’, onze ‘11’, douze ‘12’, treize ‘13’, quatorze ‘14’, quinze ‘15’, seize ‘16’,


¹ This does not mean that all determiners are lexemes but rather that cardinals have to be treated as an
exception.

² For complex cardinals, see Section 3.

³ The elements million and milliard are not simple cardinals in French; their respective values are realized
as un million (‘one million’) and un milliard (‘one billion’). They semantically belong to the quantity noun
series in -aine (see Table 2, p. 23).
vingt ‘20’, trente ‘30’, quarante ‘40’, cinquante ‘50’, soixante ‘60’,
cent ‘100’, mille ‘1,000’


Simple cardinals have the properties (1b-c). They can be used as single word answers, 
meaning they have an autonomous phonological representation. They have straightfor- 
ward semantics, denoting counting values. 


2.1 Form variation abstraction 


As for property (1a), while un ‘1’ is the only simple cardinal varying in gender (M: [œ̃]
un, F: [yn] une), many simple cardinals are subject to liaison (linking), a morphosyntactic
phenomenon whereby French words can change in form depending on the phonological
properties of the following word. For example, in (3), the adjective BON agrees in gender
and number with the following noun, in both cases masculine and singular. But in a
liaison context such as the prenominal position, the form bɔ̃ appears in (3a) in front of a word
starting with a consonant (not a liaison trigger) and the form bɔn appears in (3b) in
front of a vowel-initial word (a liaison trigger). Outside liaison contexts, as in (3c), adjectives
assume the same form as in a liaison context without a trigger.⁴


(3) a. un bon collègue
       œ̃ bɔ̃ kɔlɛg
       ‘a good colleague’
    b. un bon ami
       œ̃ bɔn ami
       ‘a good friend’
    c. bon à manger
       bɔ̃ a mɑ̃ʒe
       ‘ready to eat’


Unlike adjectives, cardinals can have three different forms for the three contexts above.⁵
For example, six ‘6’ has different realizations (si, siz, sis) for the three contexts:


(4) a. in liaison context without a liaison trigger → si
       six souris
       si suʁi
       ‘six mice’
    b. in liaison context with a liaison trigger → siz
       six écureuils
       siz ekyʁœj
       ‘six squirrels’
    c. not in liaison context → sis
       six à attraper
       sis a atʁape
       ‘six to catch’


⁴ For more details about the morpho-syntactic aspects of liaison see Bonami et al. (2004).
⁵ See Plénat (2008), Plénat & Plénat (2011) and the citations therein for a detailed description.
Not all cardinals have different forms in all three contexts. Table 1 gives the five different
patterns of syncretism found with the simple cardinals. Type A cardinals are not sensitive
to liaison and thus display only one form; in type B the form used in liaison context without
a trigger and the form used outside liaison context are identical, and the linking form used
before a trigger has an additional consonant at the end, while in type C all three forms are
distinct. In type D, the form used in liaison context without a trigger is overabundant with a
long form and a short form, and the long form is also used in the two other contexts. Type E
is a variant of type B where instead of having an additional consonant for the linking form,
the final fricative alternates between voiceless f and its voiced counterpart v.⁶


Table 1: Type of simple cardinal variation according to liaison 


Type  Example  Liaison, no trigger  Liaison, trigger  No liaison  Cardinals
A     4        katʁ                 katʁ              katʁ        4, 7, 11, 12, 13, 14, 15, 16, 30, 40, 50, 60, 1000
B     2        dø                   døz               dø          1, 2, 3, 20, 100
C     6        si                   siz               sis         6, 10
D     5        sɛ̃/sɛ̃k               sɛ̃k               sɛ̃k         5, 8
E     9        nœf                  nœv               nœf         9
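The syncretism patterns in Table 1 can be made explicit with a small lookup. The following is only an illustrative sketch: the names `Context`, `FORMS`, `realizations` and `syncretism_type` are hypothetical, and the classification heuristic simply restates the prose description of types A to E.

```python
from enum import Enum


class Context(Enum):
    """The three contexts of Table 1 (labels are assumptions of this sketch)."""
    LIAISON_NO_TRIGGER = 0  # liaison context, next word is consonant-initial
    LIAISON_TRIGGER = 1     # liaison context, next word is vowel-initial
    NO_LIAISON = 2          # outside liaison context


# One representative cardinal per type; forms copied from Table 1.
# Each cell is a tuple because a cell may be overabundant (type D).
FORMS = {
    "quatre": (("katʁ",), ("katʁ",), ("katʁ",)),
    "deux": (("dø",), ("døz",), ("dø",)),
    "six": (("si",), ("siz",), ("sis",)),
    "cinq": (("sɛ̃", "sɛ̃k"), ("sɛ̃k",), ("sɛ̃k",)),
    "neuf": (("nœf",), ("nœv",), ("nœf",)),
}


def realizations(cardinal, context):
    """Return the form(s) of a cardinal in a given context."""
    return FORMS[cardinal][context.value]


def syncretism_type(forms):
    """Classify a three-cell paradigm into the types A-E described in the text."""
    no_trigger, trigger, no_liaison = forms
    if len(no_trigger) > 1:
        return "D"                      # overabundant cell
    a, b, c = no_trigger[0], trigger[0], no_liaison[0]
    if a == b == c:
        return "A"                      # insensitive to liaison
    if a == c:
        # B adds a consonant to the shared form; E alternates f ~ v instead.
        return "B" if b.startswith(a) else "E"
    return "C"                          # remaining case: no-trigger and no-liaison forms differ
```

For instance, `realizations("six", Context.LIAISON_TRIGGER)` returns the single form `siz`, while the same call for `cinq` in the no-trigger context returns both the short and the long form.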


The simple cardinals in (2) have an associated form paradigm for liaison, which fits
Fradin's property (1a). This property is part of the conceptual definition of lexeme; it is
neither necessary nor sufficient by itself. Definite determiners which have form paradigms
in French and German are not considered lexemes, while English adjectives are lexemes 
even though their forms do not vary. 

We turn now to the two remaining properties (1d-e): belonging to an open-ended
category and participating as the output and potentially the input of derivational morphology.


2.2 Morphological input 


In French, simple cardinals clearly serve as input for several morphological derivations 
as summarised in Table 2 below (see Saulnier 2008, Fradin & Saulnier 2009, Saulnier 2010 
for a detailed discussion). 

As bases for the ordinals, simple cardinals are part of a morphological category in 
the terms of Van Marle (1985), namely the derivational domain of ordinals, but to satisfy (1d),
simple cardinals have to belong to a unique morphosyntactic category. 


⁶ In the case of type E, there is also hesitation for the linking form between nœv and nœf as they can both
provide an onset for the following trigger, unlike in type B.

⁷ While belonging to the same series of nouns designating groups of approximate cardinality, millier
(‘thousand’), million (‘million’), milliard (‘billion’) are derived from mille with different suffixes (-ier, -ion, -iard).
Table 2: Some derivations on French cardinals (adapted from Fradin & Saulnier 2009: 201)

Suffix  Derivation                                      Category
-ième   deux (2) → deuxième (‘second’ ordinal)          Adj
-ième   cinq (5) → cinquième (‘fifth’ part)             Adj/N
-ain    quatre (4) → quatrain (‘quatrain’)              N
-aine   douze (12) → douzaine (‘dozen’)                 N
-aire   trente (30) → trentenaire (‘thirty-year-old’)   Adj/N


2.3 Morphosyntactic category 


Following Saulnier (2010), we consider simple cardinals to be a sub-category of indefinite 
determiners, CARD. 

Saulnier (2010: 31-40) applies the discriminating contexts defined in Leeman's (2004)
work on French indefinite determiners. She shows that cardinals have the following
distribution across the six diagnostic contexts. 


(5) en dislocation: + → il a deux solutions = il en a deux
    ‘he has 2 solutions = he has 2’
    only alone before N: - → mes deux livres (*mes plusieurs livres)
    ‘my 2 books (*my several books)’
    following the indefinite: - → *un deux livres (un certain livre)
    ‘*a 2 books (a certain book)’
    following the definite: + → les deux livres (*les certains livres)
    ‘the 2 books (*the certain books)’
    followed by the definite: - → *deux les livres (tous les livres)
    ‘*2 the books (all the books)’
    followed by de NP: + → deux de mes collègues
    ‘2 of my colleagues’


With these criteria in mind for the category CARD, it becomes clear that there are sim- 
ple cardinals that were not listed in (2) because they do not participate in the formation 
of complex cardinals. 

Zéro ‘0’, for example, is not a construction unit for complex cardinals but it behaves 
like a CARD in all the contexts in (5). Saulnier (2010: 38) considered zéro to depart from
the distribution of cardinals because she could not find examples for the contexts in (6),
expecting zéro to be singular.⁸


⁸ In the same contexts, Saulnier does not examine un and the surprising plural number that arises when it
follows a definite or a possessive. For example, in pour ses/son un mois ‘for his one month anniversary’, the
masculine singular form of the possessive son is far less common than the plural ses; the possessive can
take its plural form ses despite the presence of the cardinal un ‘1’.



(6) Examples from the web (26/12/2016)
    en dislocation: + → Il a des tas d'contacts, des tonnes de numéros pour
    remplir son phone mais des vrais potes il en a zéro⁹
    only alone before N: - → Et il ne nous restera alors que nos, zéro euros
    d'augmentation pour pouvoir demander un crédit.¹⁰
    following the indefinite: - → *un zéro livre/livres.¹¹
    following the definite: + → Je vote pour les, zéro heures payées
    trente-cinq.¹²
    followed by de NP: + → Mais même les potes des autres viennent ici et zéro
    de mes potes sont venus me voir.¹³
And contra Saulnier (2010), zéro also appears in zéro+N subject NPs:
    zéro+N subject: + → Pendant ce temps, zéro personnes, sont mortes de
    surdoses de marijuana.¹⁴
In derivational morphology, zéro also gives a corresponding ordinal zéroième following
the pattern of other simple cardinals.


2.4 Morphological output 


Apart from fixed value cardinals, French uses variable cardinals such as n ‘n’ (pronounced
[ɛn]) or x ‘x’ (pronounced [iks]). Like zéro, these variable cardinals do not participate
in complex cardinal formation but they appear in the contexts in (5) and allow a subset
of the derivations for fixed value cardinals (e.g. énième ‘nth’ pronounced [enjɛm] and
xième ‘xth’ pronounced [iksjɛm]).


(7) a. Une solution consiste à rechercher les N meilleures solutions pour chaque
       ville épelée.¹⁵
    b. Donc l'installateur fait des bidouilles avec les X paramètres qui en [soi] ne
       sont pas très clairs ou pas forcément adaptés aux diverses situations des
       clients.¹⁶


⁹ ‘He’s got many contacts, tons of numbers to fill his phone, but real mates, he’s got zero’
https://genius.com/Enz-narcisse-and-cassandre-lyrics

¹⁰ ‘Then we will only have our 0 euros of raise to ask for a credit’
http://psasochaux.reference-syndicale.fr/files/2015/04/Tract-avril-15.pdf

¹¹ ‘*a zero book/books’

¹² ‘I’m voting for the 0 hours being paid as 35’
https://fr.toluna.com/opinions/762230/Etes-vous-pour-ou-contre-les-35-heures

¹³ ‘But while even the other guys’ pals come here, 0 of mine have come to see me’
https://twitter.com/MisHyding/status/762360289329307649

¹⁴ ‘All this while, 0 persons have died of marijuana overdose’
https://anarchocommunismelibertaire.wordpress.com/

¹⁵ ‘A solution would be to search for the N best possibilities for every city name’
http://www.afcp-parole.org/spip.php?article152

¹⁶ ‘So the installer switches around the X parameters which are a bit obscure or not necessarily adapted to
the various customer situations’
https://www.bricozone.fr/t/reglage-chaudiere-viessman.11296/page-7


    c. Aujourd’hui, je constate que pour la énième fois, une voiture est garée devant
       mon entrée de garage, m'empêchant de sortir.¹⁷


These cardinals are obtained by converting letter names, usually French or Greek, to 
cardinals, making them the output of a morphological process and therefore fitting part 
of criterion (1e). 


2.5 Open-ended set 


In the general domain or in mathematical contexts this practice is limited to the con- 
version of a few letter names, but in computer programming names for integer-valued 
variables are created all the time and behave as simple cardinals, making CARD an
open-ended category.¹⁸ Even the derived ordinals appear in computer program descriptions.


(8) a. Lance le son à partir de la nbième [ɛnbejɛm] seconde.¹⁹
    b. appFunc(NUM): Renvoie l'adresse de la NUMième fonction de la page
       courante.²⁰


The preceding discussion shows that French simple cardinals are part of an open- 
ended set with the productive coinage of integer variables. As we have seen above, ordi- 
nal derivation takes simple cardinals as input and letter name conversion gives simple 
cardinals as output. These three observations indicate that French simple cardinals fit 
the property (1e). 
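The way freshly coined members of CARD feed ordinal derivation can be pictured with a short sketch. It is an illustration under stated assumptions, not an analysis proposed in the text: the function name `ordinal` and the list `SIMPLE_CARDINALS` are hypothetical, the spelling adjustments (final e dropped, cinq → cinqu-, neuf → neuv-) are ordinary French orthography rather than something the chapter discusses, and the suppletive ordinal of isolated un is left out.

```python
# Written forms of the fixed-value simple cardinals listed in (2), plus zéro.
SIMPLE_CARDINALS = (
    "zéro un deux trois quatre cinq six sept huit neuf dix onze douze treize "
    "quatorze quinze seize vingt trente quarante cinquante soixante cent mille"
).split()


def ordinal(cardinal):
    """Attach -ième to the written form of a member of CARD (hypothetical helper)."""
    stem = cardinal
    if cardinal in SIMPLE_CARDINALS:
        # Orthographic stem adjustments, assumed here for native cardinals only.
        if stem.endswith("e"):
            stem = stem[:-1]            # quatre -> quatr-, douze -> douz-
        elif stem.endswith("q"):
            stem += "u"                 # cinq -> cinqu-
        elif stem.endswith("f"):
            stem = stem[:-1] + "v"      # neuf -> neuv-
    # Variable names (x, nb, NUM, ...) are taken as they are.
    return stem + "ième"
```

Any new integer variable name can be passed in without extending the function, which is the sense in which the input set is open-ended: `ordinal("nb")` and `ordinal("NUM")` yield the forms attested in (8).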


2.6 Interim conclusion: the lexical status of simple cardinals 


In this section, we have shown that simple cardinals in French have all the proper- 
ties deemed characteristic of lexemes by Fradin (2003). Like typical lexemes, elements 
of CARD are created by borrowing and arbitrary coining while grammemes emerge 
through diachronic phenomena. Considering simple cardinals to be lexemes might seem 
at odds with the fact that we have taken them to be a sub-category of determiners, usu- 
ally not regarded as a lexeme-based category. In the following section, we argue that 
CARD, in general, are a part of the syntactic category of determiners but constitute a 
morphological category of their own. 


¹⁷ ‘Today, for the nth time, I see a car parked in front of my garage door, blocking my way’
https://goo.gl/lOrTuo

¹⁸ Note that the French complex cardinals are not an open-ended set but rather a large set containing one
trillion elements, as French speakers can count from 0 to 999,999,999,999.

¹⁹ ‘Run the soundbite from the nbth second’
http://www.forum-dessine.fr/index.php?id=06038

²⁰ ‘Returns the address of the NUMth function in the current page’
https://goo.gl/LHh46c
3 French cardinals: Category?


In this section, we examine the status of French cardinals, simple and complex. We start 
with an overview of ‘The Composition of Complex Cardinals’ (Ionin & Matushansky
2006), as an example of a completely syntactic view of cardinal derivation. Then we 
argue that the phonological idiosyncrasies of complex cardinals are best modelled with 
a morpholexical system. 


3.1 Complex cardinals in syntax 


Ionin & Matushansky (2006: 316) argue that ‘complex cardinals are composed entirely 
in syntax and interpreted by the regular rules of semantic composition’. 


3.1.1 Semantics 


Their analysis describes the semantics of complex cardinals and their syntax in several 
languages, focusing particularly on Russian. To allow for the semantic combination of 
Cards in CardP, they propose that simplex cardinals have the type <<e,t>, <e,t>> so that
a series of simplex cardinals followed by a noun predicate of type <e,t> will be able to
combine step by step with a parent simplex cardinal as in (9) and result in a type <e,t>.


(9)                 <e,t>
               /             \
     <<e,t>,<e,t>>          <e,t>
          two           /           \
               <<e,t>,<e,t>>       <e,t>
                  hundred          books


The actual semantic combination is not described in detail but the authors seem to rely 
on the packing strategy of Hurford (2007) where complex cardinals are analyzed based 
on the simple set of syntagmatic rules associated with calculations in (10). Figure 1 gives 
the corresponding structure for 5,002,600. 


(10) NUMBER → DIGIT | PHRASE (NUMBER)
         value(NUMBER) = value(PHRASE) + value(NUMBER)
     PHRASE → (NUMBER) M
         value(PHRASE) = value(NUMBER) × value(M)


Hurford describes the packing strategy as a constraint on the syntagmatic grammar 
in (10): 
[NUMBER
  [PHRASE [NUMBER [DIGIT five]] [M million]]
  [NUMBER
    [PHRASE [NUMBER [DIGIT two]] [M thousand]]
    [NUMBER
      [PHRASE [NUMBER [DIGIT six]] [M hundred]]]]]

Figure 1: Syntagmatic analysis of 5,002,600 from Hurford (2007)


* The sister constituent of a NUMBER must have the highest possible value.²¹
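The interaction between the rules in (10) and the packing strategy can be sketched as a small tree builder. This is an assumption-laden illustration rather than Hurford's own formalization: the functions `build`, `value` and `words` are hypothetical names, only the English multipliers million, thousand and hundred are included, the optional NUMBER inside PHRASE is always spelled out, and numbers whose coefficients are not single digits (for example 45) are rejected because the grammar fragment has no rule for them.

```python
DIGITS = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
# Multipliers (category M), listed from the highest value to the lowest.
M_VALUES = {"million": 10**6, "thousand": 10**3, "hundred": 10**2}


def build(n):
    """Build a NUMBER tree for n, giving the PHRASE the highest possible value."""
    if 1 <= n <= 9:
        return ("NUMBER", ("DIGIT", DIGITS[n]))
    for word, m in M_VALUES.items():          # dicts keep insertion order
        if n >= m:
            coefficient, rest = divmod(n, m)  # largest multiple of the largest M
            phrase = ("PHRASE", build(coefficient), ("M", word))
            return ("NUMBER", phrase, build(rest)) if rest else ("NUMBER", phrase)
    raise ValueError(f"{n} is outside this grammar fragment")


def value(tree):
    """Compute the value of a tree with the arithmetic attached to the rules."""
    label, *children = tree
    if label == "DIGIT":
        return DIGITS.index(children[0])
    if label == "M":
        return M_VALUES[children[0]]
    if label == "PHRASE":
        return value(children[0]) * value(children[1])  # NUMBER x M
    return sum(value(child) for child in children)      # PHRASE + NUMBER


def words(tree):
    """Read the terminal words off a tree, left to right."""
    label, *children = tree
    if label in ("DIGIT", "M"):
        return [children[0]]
    return [w for child in children for w in words(child)]
```

Calling `build(5002600)` reproduces the bracketing of Figure 1, and `value` applied to that tree returns 5,002,600. The greedy `divmod` step plays the role of the constraint, in the same way as the conversion of seconds into days, hours and minutes.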


The semantic analysis proposed by Ionin & Matushansky (2006) does not warrant a 
syntactic view of complex cardinals. From an external perspective, it manages to treat 
complex cardinals and simple cardinals in the same manner, giving them the same se- 
mantic type and the same combinatorial constraints on the counted noun (atomicity and 
countability). 


3.1.2 Syntax 


Concerning syntax, Ionin & Matushansky (2006) describe two phenomena relevant to 
French cardinals: case assignment and number morphology. In Russian, cardinal-contain- 
ing NPs do not realize the direct cases (nominative & accusative) the same way as other 
NPs. For example, the NPs in (11) could all be used as subjects or direct objects. In (11a), 
šag ‘step’ has the nominative/accusative plural form expected for a direct argument but
in (11b) it has the genitive singular form (paucal in the terms of Ionin & Matushansky) 
and, in (11c), the genitive plural form. 


(11) a. šag-i
        step-NOM.PL
        ‘steps’
     b. četyre šag-a
        four step-GEN.SG
        ‘four steps’


²¹ This constraint is intended to have the same effect as converting time in seconds into complex units such as
days/hours/minutes/seconds, maximising the number of days first, then hours, minutes and finally seconds.
     c. šest' šag-ov
        six step-GEN.PL
        ‘six steps’


The case and number appearing on the head noun depend on the last simple cardinal 
in CardP. Cardinal 1 does not interfere with direct cases, cardinals 2-4 assign genitive 
singular and the other cardinals assign genitive plural. 
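The generalization just stated amounts to a three-way lookup on the last simple cardinal. The sketch below only restates it; the function name `assigned_form` and the two small form tables are hypothetical, and the forms are those given in (11) and (12).

```python
# Forms cited in the examples: šag 'step' and tysjača '1,000'.
SAG = {"GEN.SG": "šag-a", "GEN.PL": "šag-ov"}
TYSJACA = {"GEN.SG": "tysjač-i", "GEN.PL": "tysjač"}


def assigned_form(last_simple_cardinal):
    """Return the case/number imposed by the last simple cardinal in CardP.

    None means that the cardinal does not interfere with the direct case.
    """
    if last_simple_cardinal == 1:
        return None
    if 2 <= last_simple_cardinal <= 4:
        return "GEN.SG"
    return "GEN.PL"
```

With this lookup, `SAG[assigned_form(4)]` gives the noun form of četyre šag-a and `SAG[assigned_form(6)]` that of the six-step example; the same function indexes `TYSJACA` for the multiplied cardinal inside CardP.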

This phenomenon also happens inside CardP in multiplicative contexts such as (12). 
Tysjača ‘1,000’ appears in the nominative singular alone, but in the genitive singular with
4 and in the genitive plural with 5.


(12) a. tysjač-a šag-ov
        thousand-NOM.SG step-GEN.PL
        ‘one thousand steps’
     b. četyre tysjač-i šag-ov
        four thousand-GEN.SG step-GEN.PL
        ‘four thousand steps’
     c. pjat' tysjač šag-ov
        five thousand.GEN.PL step-GEN.PL
        ‘five thousand steps’


The form variations above do not interfere with the external case and number. The case 
and number realized internally on the head noun and the multiplied cardinals in the 
CardP do not affect the case and number of the NP in its relation to the rest of the 
sentence. 

French does not have an inflectional case system similar to Russian but cardinals still 
display similar properties. In syntax, the CARD category identified for morphology in 
Section 2.3 opposes the cardinals ending with the elements million and milliard, infelicitous
in (13a), to all other cardinals, infelicitous in (13b).²²


(13) a. Paul a deux/cent/*un million euros à la banque.
        ‘Paul has X euros in his account.’
     b. Paul a *deux/*cent/un million d'euros à la banque.
        ‘Paul has X of euros in his account.’


The data in (13) could be interpreted as a difference in category, un million being consid- 
ered as a noun rather than a CARD. But while the use of un million changes the shape 
of the NP, it does not affect its external relations to the sentence, just as in Russian. It 
appears that million and milliard assign genitive plural to the head noun resulting in


²² This could be construed as million and milliard being classifiers but their behavior in complex numerals
shows that they are indeed cardinal construction elements.

(i) un milliard trois cents millions d'euros ‘1,300,000,000 euros’

(ii) un, million une, pages, ‘1,000,001 pages’
a de NP without changing the overall distribution of the cardinal-containing NP. Both
structures participate in the contexts (5) used by Saulnier (2010) repeated below. 


(14) en dislocation: + → il en a deux/un million
     ‘he has 2/1,000,000’
     only alone before N: - → mes deux livres/mes un million de livres
     ‘my 2/1,000,000 books’
     following the definite: + → les deux livres/les un million de livres
     ‘the 2/1,000,000 books’
     followed by de NP: + → deux/un million de mes collègues
     ‘2/1,000,000 of my colleagues’


Including million, milliard and their combinations in the CARD category with differ- 
ent controlling features captures the external similarity while retaining the appropriate 
contrast between the different NP structures CARD N vs CARD de N in the examples 
above. 

French also displays number morphology inside complex cardinals, like Russian. The
marks are visible in liaison contexts before triggers as shown in (15). 


(15) a. cent ans
        sɑ̃t ɑ̃
        ‘one hundred years’
     b. deux cents ans
        dø sɑ̃z ɑ̃
        ‘two hundred years’
     c. vingt ans
        vɛ̃t ɑ̃
        ‘twenty years’
     d. quatre-vingts ans
        katʁə-vɛ̃z ɑ̃
        ‘eighty years’


The linking forms of the simple cardinals cent and vingt, used before a liaison trigger, end in t,
but their final consonant is replaced by z in multiplicative contexts.²³ This change does not seem
to be mandated by plural marking, as cent and vingt are already plural controllers.²⁴

All in all, Ionin & Matushansky (2006) and Hurford (2007) provide an interesting 
framework in which to analyze French cardinals as a unique syntactic category. The differentiated
control properties and the idiosyncratic number morphology they propose


²³ In liaison contexts, the t-final linking forms alternate with the t-less forms depending on collocations. Frequent
ones such as vingt ans ‘20 years’ and cent ans ‘100 years’ are generally pronounced with the t-final forms (vɛ̃t ɑ̃,
sɑ̃t ɑ̃), but rarer collocations like vingt écureuils ‘20 squirrels’ and cent écureuils ‘100 squirrels’ are often
found with the t-less forms (vɛ̃ ekyʁœj, sɑ̃ ekyʁœj). But in any case, the emergence of a z-final form outside
multiplicative contexts is considered faulty: *vɛ̃z ekyʁœj, *sɑ̃z ekyʁœj.

²⁴ Hurford (2003: Section 3) describes a case in Finnish where number marking on cardinals makes a difference.
Plural cardinals count groups of N while singular cardinals count N individuals.
allow for a uniform syntactic analysis where all complex cardinals are constructed in
the same way. However, the phonological aspects of French cardinals do not go along 
with the perfectly predictable semantics and syntax of the complex cardinals on which 
Ionin & Matushansky (2006) build their syntactic view of the process. 


3.2 Complex cardinals and phonology 


From a phonological standpoint, idiosyncrasies are everywhere in the construction of 
French complex cardinals. In the following we review the various combinatorial excep- 
tions in the formation of complex cardinals and argue that it would be difficult to account 
for these with a purely syntactic analysis. 

As we have seen in section 2.1, French simple cardinals are subject to form variation 
according to liaison contexts. In the derivation of complex cardinals, however, simple 
cardinals use the same forms but in quite different distributions. For example, vingt ‘20’ 
and cent ‘100’ belong to the same type B in Table 1, p. 22: both combine with simple 
cardinals 2-9, but vingt uses the linking form vɛ̃t²⁵ even though these cardinals are not liaison
triggers, while cent uses the form without liaison consonant, sɑ̃, in the same context, as shown in Table 3.


Table 3: vingt and cent combinations with simple cardinals from 2 to 9 


      2       3         4         5        6        7        8        9
20    vɛ̃t-dø  vɛ̃t-tʁwa  vɛ̃t-katʁ  vɛ̃t-sɛ̃k  vɛ̃t-sis  vɛ̃t-sɛt  vɛ̃t-ɥit  vɛ̃t-nœf
100   sɑ̃-dø   sɑ̃-tʁwa   sɑ̃-katʁ   sɑ̃-sɛ̃k   sɑ̃-sis   sɑ̃-sɛt   sɑ̃-ɥit   sɑ̃-nœf


Combinations involving cinq ‘5’ and huit ‘8’ in the construction of multiples of 100
and 1000 are not parallel even though they belong to the same type D of simple cardinals
in Table 1, with two alternating realisations for the form used in liaison context without a
trigger: sɛ̃/sɛ̃k, ɥi/ɥit. With cinq both of these forms can be used in the combinations but
with huit only the short form ɥi is felicitous:


(16) a. 500 sɛ̃-sɑ̃/sɛ̃k-sɑ̃, 5000 sɛ̃-mil/sɛ̃k-mil
     b. 800 ɥi-sɑ̃/*ɥit-sɑ̃, 8000 ɥi-mil/*ɥit-mil


Moreover, the same simple cardinal dix ‘10’ combines with 7-9 and with 1000, none of
which are liaison triggers, but it uses the linking form diz in the first case and the form
without liaison consonant, di, in the second:


(17) a. 17 diz-sɛt, 18 diz-ɥit, 19 diz-nœf
     b. 10000 di-mil


Finally, instances of quatre-vingt have to be pronounced with an r at the end of quatre, 
even for speakers who usually drop it in word-final complex codas. 


²⁵ Note that this holds true independently of the fact that the form of 20 used outside liaison context is
subject to diatopic variation between vɛ̃ and vɛ̃t.
(18) a. un arbre frappé par la foudre
        œ̃n aʁbʁ fʁape paʁ la fudʁ = œ̃n aʁb fʁape paʁ la fud
     b. vingt-quatre francs
        vɛ̃tkatʁə fʁɑ̃ = vɛ̃tkat fʁɑ̃
     c. quatre-vingts francs
        katʁəvɛ̃ fʁɑ̃ ≠ *katvɛ̃ fʁɑ̃²⁶


We conclude that even though both the semantic and syntactic dimensions of complex 
cardinal formation are simple and regular, the combinatory principles at work at the 
phonological level are far from simple and must be specific to cardinal formation, lead- 
ing us away from syntax and towards a lexical account of the derivation of complex 
cardinals. 


3.3 Complex cardinals in CARD 


As complex cardinals have the same distribution in the Saulnier-Leeman contexts in (5) 
and serve as input for the ordinal derivation, we analyze numerical cardinals as com- 
pounds created by means of a phrase structure grammar similar to those proposed by 
Hurford (1975, 1994, 2003, 2007). The analysis will be presented in two parts. We first 
introduce a model limited to the structure of 2-digit cardinals where most of the phono- 
logical and syntagmatic idiosyncrasies occur and then generalize it to the rest of the 
cardinals. 


3.3.1 2-digit cardinals 


Cardinal components are categorized according to their combinatorial properties (Table 
4). To demonstrate the mechanics of the analysis, we use arbitrary categories rather than 
motivated features to differentiate elements. The category names reflect their purpose 
in the system. Unit categories start with u for digits (u, u1, u4, u7) and uv (uv, uv1) for
units under 20, while categories for multiples of ten begin with d (d, d1, d2, d6).²⁷

The rules in Table 5 generate all 2-digit cardinals (category Digit2). Rule 1 states that
simple cardinals are de facto Digit2. Rule 2 generates dix-sept, dix-huit, dix-neuf. Rules
3 and 5 assemble et un and et onze. Rule 4 produces DixP for numbers between vingt
‘20’ and cinquante-neuf ‘59’.²⁸ Rule 6 makes the soixante compounds from soixante ‘60’
to soixante-dix-neuf ‘79’ and rules 7 to 9 create the compounds based on quatre-vingt
for numbers between quatre-vingts ‘80’ and quatre-vingt-dix-neuf ‘99’.²⁹ Finally, rule 10
elevates all intermediary compounds to Digit2.


²⁶ katvɛ̃ is correct, however, for the decimal number ‘4.20’.

²⁷ To account for the Swiss and Belgian cardinal systems, the category d would have to include septante ‘70’,
octante/huitante ‘80’ and nonante ‘90’.

²⁸ In rule 4, the linking form is selected for the first term: for d2, vɛ̃+t.

²⁹ In rule 7, the liaison consonant for d2 changes to z: the linking form becomes vɛ̃+z. In rule 9, the form
without liaison consonant is selected for the first term, Dix8X: katʁəvɛ̃.
Table 4: Categories of cardinal components for 2-digit cardinals

Cat  Components                                   Example
u    deux (2), trois (3), cinq (5), six (6)       vingt-deux         20+2=22
u1   un (1)                                       vingt-et-un        20&1=21
u4   quatre (4)                                   quatre-vingts      4×20=80
u7   sept (7), huit (8), neuf (9)                 dix-sept           10+7=17
uv   douze (12), treize (13), quatorze (14),      soixante-douze     60+12=72
     quinze (15), seize (16)
uv1  onze (11)                                    soixante-et-onze   60&11=71
d    trente (30), quarante (40), cinquante (50)   trente-deux        30+2=32
d1   dix (10)                                     *dix-deux≠douze    10+2≠12
                                                  soixante-dix       60+10=70
d2   vingt (20)                                   quatre-vingts      4×20=80
d6   soixante (60)                                soixante-treize    60+13=73
et   et (&)                                       trente-et-un       30&1=31

Table 5: Syntagmatic rules for 2-digit cardinals

Rule                                            Comment
1   Digit2 → u/u1/u4/u7/uv/uv1/d/d1/d2/d6       simplex cardinals
2   Dix1P → d1 u7                               linking form diz (10) for 10+7..9
3   Et1 → et u1                                 eœ̃ (&1) for 20/30/40/50&1
4   DixP → d/d2 u/u4/u7/Et1                     20/30/40/50+2..9/&1 (linking form of d2)
5   Et11 → et u1/uv1                            eœ̃/eɔ̃z (&1/&11) for 60&1/11
6   DixP → d6 u/u4/u7/d1/Et11/uv/Dix1P          60+2..10/12..16/(10+7..9)/&(1/11)
7   Dix8X → u4 d2                               4×20 for 80 (liaison consonant z on d2)
8   DixP → Dix8X                                complex 80
9   DixP → Dix8X u/u1/u4/u7/d1/uv/uv1/Dix1P     80+1..16/(10+7..9) (Dix8X without liaison consonant)
10  Digit2 → DixP                               complex cardinals
The syntagmatic rules in Table 5 integrate constraints stipulating the combining forms:


(19) a. Rules 2 and 4 use the linking form e of the first component; 
b. Rule 7 changes the liaison consonant of the second component from t to z; 
c. Rule 9 uses the e form of the first component. 
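
The coverage of the rules in Table 5 can be checked mechanically. The following is a minimal sketch, not the authors' implementation: it works on spellings rather than phonological forms, so the combining-form constraints in (19) and the plural -s of quatre-vingts are ignored, and the names LEXICON, RULES and expand are hypothetical. It also assumes that rule 10 promotes Dix1P as well as DixP, in line with the statement that rule 10 elevates all intermediary compounds to Digit2.

```python
from functools import lru_cache

# Lexical categories of Table 4, as spelling -> value ("et" adds nothing).
LEXICON = {
    "u":   {"deux": 2, "trois": 3, "cinq": 5, "six": 6},
    "u1":  {"un": 1},
    "u4":  {"quatre": 4},
    "u7":  {"sept": 7, "huit": 8, "neuf": 9},
    "uv":  {"douze": 12, "treize": 13, "quatorze": 14, "quinze": 15, "seize": 16},
    "uv1": {"onze": 11},
    "d":   {"trente": 30, "quarante": 40, "cinquante": 50},
    "d1":  {"dix": 10},
    "d2":  {"vingt": 20},
    "d6":  {"soixante": 60},
    "et":  {"et": 0},
}

# Rules of Table 5 as (number, parent, left options, right options, operation).
# A right side of None marks a unary rule that only relabels its daughter.
# Assumption: rule 10 is read as promoting Dix1P as well as DixP, so that
# dix-sept, dix-huit and dix-neuf also reach Digit2.
RULES = [
    (1, "Digit2", ("u", "u1", "u4", "u7", "uv", "uv1", "d", "d1", "d2", "d6"), None, None),
    (2, "Dix1P", ("d1",), ("u7",), "+"),
    (3, "Et1", ("et",), ("u1",), "+"),
    (4, "DixP", ("d", "d2"), ("u", "u4", "u7", "Et1"), "+"),
    (5, "Et11", ("et",), ("u1", "uv1"), "+"),
    (6, "DixP", ("d6",), ("u", "u4", "u7", "d1", "Et11", "uv", "Dix1P"), "+"),
    (7, "Dix8X", ("u4",), ("d2",), "*"),
    (8, "DixP", ("Dix8X",), None, None),
    (9, "DixP", ("Dix8X",), ("u", "u1", "u4", "u7", "d1", "uv", "uv1", "Dix1P"), "+"),
    (10, "Digit2", ("DixP", "Dix1P"), None, None),
]


@lru_cache(maxsize=None)
def expand(category):
    """Return every (spelling, value) pair derivable for `category`."""
    pairs = set(LEXICON.get(category, {}).items())
    for _, parent, lefts, rights, op in RULES:
        if parent != category:
            continue
        for left in lefts:
            for l_form, l_val in expand(left):
                if rights is None:
                    # Unary rule: the daughter is passed up unchanged.
                    pairs.add((l_form, l_val))
                    continue
                for right in rights:
                    for r_form, r_val in expand(right):
                        value = l_val * r_val if op == "*" else l_val + r_val
                        pairs.add((f"{l_form}-{r_form}", value))
    return frozenset(pairs)


by_value = {value: form for form, value in expand("Digit2")}
print(len(by_value), by_value[71], by_value[97])
```

Under these assumptions the grammar derives each value from 1 to 99 exactly once, which is the sense in which the rules generate all 2-digit cardinals.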


Figure 2 illustrates the application of rules 1-4, and more particularly the way diz-sɛt
and vɛ̃t-sɛt are obtained with d1.e diz and d2.e vɛ̃t.


Figure 2: Phrase structures for 1, 7, 10, 11, 17, 20, 27


Figure 3 shows how et onze ‘& 11’ and intermediary compounds such as dix-sept ‘17’ 
are combined with soixante ‘60’. 


Figure 3: Phrase structures for 70, 71, 77


Finally, Figure 4 displays the combinations involving the quatre-vingt intermediary 
compound. When Dix8X is formed, the linking consonant of vingt is changed from t 
to z, but when the Dix8X is itself combined with another element by means of rule 9, 
its e form is selected, rendering the previous change invisible. Thus we obtain the e
form katʁə-vɛ̃z for quatre-vingts ‘80’ and the forms katʁə-vɛ̃-sɛt and katʁə-vɛ̃-diz-sɛt for
quatre-vingt-sept ‘87’ and quatre-vingt-dix-sept ‘97’.
Figure 4: Phrase structures for 80, 87, 97


Even though we provide rules for all Digit2 cardinals in Table 5, most of these com- 
pounds are probably lexicalized. The rules are like redundancy generalizations à la Lieber 
(1982) or Koenig (1999), stating observable regularities in existing lexemes. 


3.3.2 Numerical cardinals 


With most of the idiosyncrasies residing below 100, the fragment in Table 6³⁰ for the
composition of the higher combinations is simpler. It breaks the compounding into four 
levels corresponding to the counting units cent ‘100’, mille ‘1,000’, million ‘million’, and 
milliard ‘billion’. Each level is composed of two rules, one to multiply the unit level and 
one to add the units from the level below. 


Table 6: Syntagmatic rules for 3-digit+ cardinals 


      Rule                                      Comment
11    CentX → u/u4/u7.e Cent.z                  hundreds
12    CentP → CentX/Cent.e (Digit2)             adding the Digit2
13    MilleX → CentP/Digit2.e Mille             thousands
14    MilleP → MilleX/Mille (CentP/Digit2)      adding the hundreds
15    MionX → CentP/Digit2.e Mion.z             millions
16    MionP → MionX.e (MilleP)                  adding the thousands
17    MiardX → CentP/Digit2.e Miard.z           billions
18    MiardP → MiardX.e (MionP)                 adding the millions


For example, rule 11 assembles the multiples of cent ‘100’ and rule 12 adds the units
from the level Digit2.³¹ In rules 12, 16 and 18, the selection of the o form³² happens only
in the presence of the optional second term.

30 We found no critical data for or against adding a linking z to the e form of multiplied million and milliard,
rules 15 and 17.

Figure 5 shows how the two sets of rules combine in the analysis of numerical cardi- 
nals in general. 


Figure 5: Phrase structure for 600 and 697


The analysis presented here relies on 26 combination elements, the 23 in (2) plus et, 
million and milliard. All numerical cardinals, including the simple ones, are derived from 
these elements. So, on the one hand, cardinal elements belong to special categories in 
the lexicon while, on the other hand, all numerical cardinals, including the simple ones, 
are CARDs derived from cardinal elements. 


4 French cardinals: Paradigm? 


In this section, we propose an analysis for a uniform paradigm of simple and complex 
cardinals. The analysis combines the observations about gender, liaison and compound- 
ing to (i) give a set of rules that fills the cells of the paradigm with the appropriate forms 
and (ii) associate each numerical cardinal with its proper syntactic frame. 

As lexemes belonging to the CARD category, French cardinals, simple and complex, 
undergo inflection with a paradigm based on two features: 


* LIAISON: 9, ©, 6 


* GENDER: M, F 


31 These two rules could be modified to generate the 11 to 19 multiples of cent (e.g. dix-huit cents ‘1,800’). The
rest would also have to be adapted to avoid the generation of aberrations such as *un million dix-huit cents
mille ‘2,800,000’.

32 To be more precise, rules 16 and 18 select the M.o form (i.e. œ̃ for un).
This results in the six-cell paradigm exemplified in Table 7 with simple cardinals.


Table 7: Uniform paradigm of cardinals 


T M F 2? M F 4 ^M F © M F 
oe & yn e de de o kats kats e si si 
o & yn o de de o kats kats © sis sis 
e dn yn e dez dez e kats kats ® siz siz 


The paradigm of complex cardinals follows the pattern of the rightmost element in 
the compound. For example, in Table 8, TRENTE-ET-UN, QUATRE-VINGT-UN and CENT-UN 
share the pattern of UN, and TRENTE-SIX, QUATRE-VINGT-SIX and CENT-SIX inflect like SIX.

The only exceptions are vingt ‘20’ and cent ‘100’, which change their linking consonant
from t to z in rules 7 and 11 (p. 32 & p. 34).

Not only do the forms of complex cardinals depend on the element on the right edge, 
but their controlling properties are also derived from the right edge element. This distin- 


Table 8: Inflection on the Right Edge 


‘31 M F ‘81’ M F ‘101 M F 
e twatee  tsâteyn e  katrovéce  katsovëyn e sac  süyn 
o twatee  tsûteyn o  katrovéce  katsovëyn o sac  süyn 
e trüteden tsûteyn e katsovë&n  katrovéyn e sdden  sáyn 
zs «eps Con zs 
36 M F 86 M F 106 M F 
o  trütsi tgãtsi o  katrovési  katsovési o  Sáüsi süsi 
o  trütsis  tkütsis o  katrovésis katsovësis o  Süsis  süsis 
e  trütsiz  tEütsiz e  katrovésiz katwavésiz ®  süsiz  süsiz 


Table 9: Number morphology on the Right Edge 


20 M F 80 M F 


katrové katnavé 


o 
< 
M 
< 
ex 
o 


katrové katkové 


© 
< 
M 
< 
e 
© 


e vët vét e katsavéz katnavéz 
100 M F 200 M F 

o să sa e desa dosa 

o sa să o desa dosà 


® sát sat e dosäz dosäz 
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guishes cardinals ending in million/milliard from the others as seen below in (20) and in 
(13) (p. 28). 


(20) a. un milliard trois cents millions cinq cent mille chinois/*de chinois 
‘1,300,500,000 Chinese’ 


b. un milliard trois cent millions *chinois/de chinois 
‘1,300,000,000 Chinese’ 


Cardinals ending with million ‘million’ or milliard ‘billion’ impose a de-NP structure. 
We use a DE feature to encode this difference: DE = + for (20b), DE = - for (20a). 

Both the DE feature and the inflectional paradigm of compound cardinals can be constructed
using the Right Edge mechanism introduced by Tseng (2003) and Bonami et al.
(2004) to model French phrasal affixes (à ‘at’, de ‘of’) and liaison. The proposed mechanism
ensures that the properties of the rightmost element are propagated to the top
of the construction by copying the relevant features of the last component to its parent
node at every level of compounding, as represented by the arrows in Figure 6. Rules combining
two elements get a specific form from the left paradigm and prefix it to the paradigm
on the right.


Figure 6: Phrase structure for 506,033,677


For example, on the right side, in (20a), the Dix1P prefixes the M.e form of DIX diz to
all forms of SEPT and carries the controlling property DE = − from SEPT. In (20b), the
combination selects the M.e form of SIX si and combines it with the modified paradigm
of CENT where the linking consonant t of the e forms has been changed to z.
(21) a. ‘10’ ‘7’ ‘17’
M F M F M F 
e di di e set set e  dizset  dizset 
o dis dis o set set > oe  dizset dizset 
e diz diz o6 set set e  dizset  dizset 
DE-- DE = - DE = — 
b "ei ‘100° 600° 
M F M F M F 
o si si o sã sã o sis sis 
oy ty ee RN 
o sis sis o sa sá o  Sisü  sisû 
® siz siz e sz sz ®  sisüz  sisdz 
DE = — DE = - DE-- 


The percolations proceed level by level, and yield a structure at the top with a full
paradigm and the appropriate value of the DE feature.³³
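
The Right Edge combination described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the formalization the chapter relies on: the liaison values are given the placeholder labels L1, L2 and L3 (the rows of Table 7 from top to bottom), the data layout and the names cardinal, relink and combine are hypothetical, and DE is stored as a boolean.

```python
LIAISON = ("L1", "L2", "L3")  # placeholder labels: rows of Table 7, top to bottom
GENDER = ("M", "F")


def cardinal(rows, de=False):
    """Build a cardinal from three (M, F) rows of forms plus its DE value."""
    cells = {
        (liaison, gender): form
        for liaison, row in zip(LIAISON, rows)
        for gender, form in zip(GENDER, row)
    }
    return {"cells": cells, "DE": de}


DIX = cardinal([("di", "di"), ("dis", "dis"), ("diz", "diz")])
SEPT = cardinal([("sɛt", "sɛt")] * 3)
SIX = cardinal([("si", "si"), ("sis", "sis"), ("siz", "siz")])
CENT = cardinal([("sɑ̃", "sɑ̃"), ("sɑ̃", "sɑ̃"), ("sɑ̃t", "sɑ̃t")])


def relink(card, old, new):
    """Return a copy whose L3 forms end in `new` instead of `old` (cf. rules 7 and 11)."""
    cells = {}
    for (liaison, gender), form in card["cells"].items():
        if liaison == "L3" and form.endswith(old):
            form = form[: -len(old)] + new
        cells[(liaison, gender)] = form
    return {"cells": cells, "DE": card["DE"]}


def combine(left, cell, right):
    """Prefix one selected form of `left` to every cell of `right`.

    The shape of the paradigm and the DE value come from the right daughter only.
    """
    prefix = left["cells"][cell]
    cells = {key: prefix + form for key, form in right["cells"].items()}
    return {"cells": cells, "DE": right["DE"]}


# dix-sept: the diz form of DIX is prefixed to all forms of SEPT.
dix_sept = combine(DIX, ("L3", "M"), SEPT)
# six cents: the si form of SIX is prefixed to CENT with its linking t changed to z.
six_cents = combine(SIX, ("L1", "M"), relink(CENT, "t", "z"))
```

Because combine only ever reads the DE value of its right daughter, repeating it level by level delivers the DE value of the rightmost element to the top node, which is the percolation the text describes.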

The model outlined here relies on the propagation of ready-made elementary para- 
digms via a phrase structure grammar rather than rules of exponence or referral based on 
the inflectional features of the different cardinals as is common with Word and Paradigm 
syntagmatic frameworks³⁴ such as A-Morphous Morphology (Anderson 1992), Paradigm
Function Morphology (Stump 2001) or the Information-Based Model of Bonami & Crys- 
mann (2013). It is more in line with paradigm-oriented models like Network Morphology 
(Corbett & Fraser 1993). 


5 Conclusion 


In this chapter, we set out to discuss the place of cardinals in French morphology with a 
focus on their status as lexemes, their categories and their inflectional paradigms. Taking 
into account the number of phonological idiosyncrasies in the formation of French car- 
dinals, we argued that they should be considered as lexemes. Following Saulnier (2008, 
2010), Fradin & Saulnier (2009), we examined both their morphotactic properties and 
their syntactic distribution and concluded that they belonged to a morphosyntactic cate- 
gory CARD inside the determiners. We showed that there are two types of cardinals 
regarding the way they associate with nouns, the direct type like CINQUANTE-DEUX 
(cinquante-deux années) and the indirect type like UN-MILLIARD-TROIS-CENTS-MILLIONS 
(un milliard trois cents millions d'années). This distinction making no difference on the
outside of the NP, we analyzed them as compounds based on 26 simple elements³⁵ using
a phrase structure grammar, even though the cardinals below 100 are probably lexicalized.
Our compounding mechanism propagates the inflectional and syntactic properties
of the rightmost component to the entire compound to create its paradigm and percolate
its type (DE = +).

33 ‘506,033,677’ (DE = −): the M and F forms are identical in all three liaison rows, namely
sɛ̃ksɑ̃similjɔ̃tʁɑ̃ttʁwamilsisɑ̃swasɑ̃tdizsɛt.

34 See Boyé & Schalchli (2016) for a typology of views on inflectional paradigms in different theories.

The type of compounds we advocate for is different from the usual two-component 
ones. It expands the ternary compounds described in the biomedical domain by Namer 
(2005) to higher levels of composition. The extended compounding mechanism makes it possible
to generate all numerical cardinals as CARD without having to cast them into the differ-
ent subcategories that would be needed to break the compounding process into binary 
operations. It does not presuppose that complex cardinals are lexicalised but only that 
they can be created online by morphology, as [+morphological, -lexical] compounds in 
the sense of Gaeta & Ricca (2009). 

The model outlined here should be integrated with the formal analysis of Bonami et al. 
(2004) of liaison in HPSG (Pollard & Sag 1994). It would be interesting to examine data 
from the cardinals in other languages to parallel the work of Stump (2010) on the ordinals³⁶
and from the composition of the decimals and its interference with the integers.³⁷


5.1 Remaining questions 


Cardinal coordinations do not respect lexical integrity. Examples like (21a) are common, 
and even stranger coordinations appear with ordinals where the first ordinal is realised 
as a gender-agreeing cardinal as in (21b). 


(21) a. Quelques soixante-dix ou quatre-vingt mille personnages sont passés à la
trappe, 35 000 sont à l'ombre.³⁸
b. Ses débuts, il les fit, dans sa ville natale, au début du siècle dernier dans sa
vingt-et-une ou vingt-deuxième année.³⁹


Saulnier (2008) observes that quelques follows the syntactic distribution of CARDs
and derives the quelquième ordinal found in trente et quelquième ‘thirty-somethingth’.⁴⁰


35Nothing would prevent French from using more elements. In fact, it has been proposed since the 15th 
century to expand the counting system by including billion, trillion, quadrillion, etc. (see Saulnier 2010: 
147-151 for an overview of the proposals). 

36 French ordinals are derived from their cardinal counterparts by -ième suffixation as proposed by Stump
(2010: p. 228) with the notable exceptions of millionième and milliardième which drop the un from un
million and un milliard.

37 Many ill-formed cardinals are in fact well-formed decimals. For example, cinq vingt is automatically
understood as ‘5.20’. Furthermore, un million un ‘1,000,001’, when not followed by a counted noun, is usually
perceived as ‘1,100,000’ with million interpreted as a measure unit.

38 ‘Some seventy or eighty thousand persons have disappeared, 35,000 are in jail’
http://plumenclume.org/blog/173-erdogan-consolide-son-emprise-par-israel-adam-shamir

39 ‘His debut, he made at the beginning of last century, when he was in his twenty-first or twenty-second
year.’
http://www.ww w.dutempsdescerisesauxfeuillesmortes.net/fiches bio/darbon/darbon.htm

40 Fradin & Saulnier (2009) also mention combien/combientième ‘how many’, quel/quellième ‘which’ as potential
cardinal/ordinal pairs (quantième/tantième look more like fractions than ordinals).
The arguments developed in this chapter for a morphological analysis of the compo-
sition of cardinals rely on the idiosyncrasies of complex cardinals below 100. To capture 
the phenomenon in (21), it would be possible to propose a morphological analysis of 
lower complex cardinals as compounds and lexemes, while still allowing syntactic com- 
position for higher complex cardinals. 
Chapter 3


Word formation and word history: 
The case of CAPITALIST and CAPITALISM 


Franz Rainer 
WU Vienna 


The treatment of the history of modern vocabulary in historical and etymological dictionar- 
ies is generally disappointing, especially with respect to the processes by which the words 
came into being. The TLFi¹ only provides the following information concerning the his-
tory of French capitalisme and capitaliste: “Capitalisme [...] Dér. de capital²*; suff. -isme*”,
“Capitaliste [...] Dér. de capital²*; suff. -iste*”. Such a treatment, which is inadequate even
from a synchronic point of view (in the sense ‘a supporter of capitalism’, capitaliste is derived 
from capitalisme by affix substitution), does not do justice to the manifold relationships that 
have developed between these two words and their common base capital in the course of 
the 300 years since the creation of Dutch Capitalist in 1621. The present paper retraces in 
detail the many steps of the unfolding of these two words in French. It is shown that each of 
their many senses constitutes a separate lexeme and must be provided with an etymology 
of its own. Particular attention is dedicated to the identification of the exact mechanism 
(borrowing, semantic extension, word formation) that was at work at each step. 


1 In the beginning was the lexeme 


Right from the beginning of the study of the internal structure of complex words, schol- 
ars have been divided between those who tried to put complex words together from 
smaller pieces in a bottom-up fashion (the Paninian tradition) and those who tried to ac- 
count for the internal structure by mapping words onto other words (the Greco-Roman 
tradition, based on analogy). This fundamental divide is still with us, in the form of an 
opposition between what we now call “morpheme-based” and “word-based” (or "lexeme- 
based") approaches to morphology (see Aronoff 2007). In the French linguistic landscape, 
the morpheme-based approach held some sway before the turn of the millennium due 
to having been embraced by Danielle Corbin (see Corbin 1987), who played an impor- 
tant role in the renewal of the study of word formation in France. But more recently 


1Trésor de la langue française informatisé, available at http://atilf.atilf.fr/. 
most French morphologists seem to be quite unanimous in preferring the lexeme-based
approach, not least due to the forceful argumentation in its favour in Fradin (2003). 

In my contribution, I would like to pour more water on the lexeme-based mill by look- 
ing in some detail into the history of the two words CAPITALIST and CAPITALISM, in which 
semantic change, calques and word formation - suffixation, conversion, but also suffix 
substitution, a notorious conundrum for morpheme-based approaches - have interacted 
in a complex manner. It will become apparent that these changes find a natural explana- 
tion within a lexeme-based framework, while they seem to be difficult to accommodate 
without contortions in a morpheme-based one. However, the chapter is meant to be of 
interest not only to morphologists or lexicologists, who constitute the main intended 
readership. Both words treated are key concepts of present-day intellectual vocabulary 
and as such have attracted considerable attention from scholars from other disciplines, 
mostly historians such as Fernand Braudel, Lucien Febvre, Henri Hauser or Edmond Sil- 
berner in France, or Richard Passow, Marie-Elisabeth Hilger and Annette Höfer in Ger- 
many. For such readers, the linguistic arguments of this contribution may sometimes 
seem to be a little far-fetched, while they would probably here and there like to receive 
more abundant encyclopedic information. This latter type of information, however, must 
be kept to a minimum here, providing just what is necessary for underpinning the lin- 
guistic argumentation. Even so, non-linguists will hopefully appreciate the new facets 
of the history of these two words, which I was able to add to the existing documentation 
due to the abundance of new material that we can now dip into thanks to Google Books 
and Gallica.²

In order to avoid misunderstandings, one formal proviso is in order before we start 
our investigation. It is established practice in linguistics to write lexemes in small caps. 
In this tradition, the English lexeme CAPITALIST would represent the set of English word 
forms { capitalist, capitalists }. I will not follow this usage here, but use small caps in- 
stead whenever referring to a word independently from its exact formal realization in 
individual European languages. Throughout this text, CAPITALIST therefore represents 
the set {English capitalist, German Kapitalist, French capitaliste, etc.}, and similarly for 
other words in small caps. 


2 CAPITAL, CAPITALIST and CAPITALISM in synchrony and 
diachrony 


For present-day speakers of European languages, CAPITALISM refers to a specific kind 
of economic system and is undoubtedly felt to be based somehow on CAPITAL, though 
many speakers will be hard-pressed to specify the exact semantic relationship between 
base and derivative or will construe it in different ways. This indeterminacy is mainly 


2 On the history of CAPITALISTE, see Rainer (1998). A short, updated entry on the history of French capitaliste,
written together with Jean-Paul Chauveau, can be found on TLF-Étym, an etymological online dictionary 
that can be consulted at http://www.atilf.fr/tlfetym/. The corresponding entry on French capitalisme can 
be consulted on the same site. 
due to the fact that the word CAPITAL itself has various senses, not all of them equally
familiar to non-economists, and that it is not obvious which sense is the relevant one for
the construal of the meaning of CAPITALISM. The Free Dictionary,³ for example, manages
to define CAPITALISM without recourse to CAPITAL: “An economic system in which the 
means of production and distribution are privately or corporately owned and develop- 
ment occurs through the accumulation and reinvestment of profits gained in a free mar- 
ket.” CAPITALIST, on the contrary, will most often be spontaneously analyzed as based on 
CAPITALISM, referring to a supporter of the particular kind of economic system denoted 
by this word. ‘A supporter of capitalism’, in fact, is the first sense in the online dictio- 
nary quoted above, which adds two more senses that seem to be less prominent today: 
2. 'An investor of capital in business, especially one having a major financial interest in 
an important enterprise’; 3. ‘A person of great wealth’. The foregoing remarks seem to 
be valid for European languages in general. In other respects, however, individual lan- 
guages differ, for example, with respect to whether they tolerate the adjectival usage of 
CAPITALIST, possible in French and English, but not in German. The connotations of the 
members of this word family will also differ, depending on the stance that a speaker or 
speech community takes with respect to the economic system called CAPITALISM. 

The etymological treatment of CAPITALISM and CAPITALIST in historical dictionaries 
seems to have been inspired by and large by such intuitions about the synchronic rela- 
tionship between CAPITAL, CAPITALIST and CAPITALISM. The TLFi, for example, writes: 


Capitalisme subst. masc. [...] Dér. de capital²*; suff. -isme*.


Capitaliste adj. et subst. [...] Dér. de capital²*; suff. -iste*. L'hyp. d'un empr. au néerl.
kapitalist (BL.-W.⁵) ne semble pas justifiée. Le corresp. all. Kapitalist « possesseur
d’un capital » est attesté dep. 1694 (WEIGAND).⁴


As we will see, this kind of analysis in no way does justice to the complex interre- 
lationships that have developed over time among the three words of this word family, 
nor to the inter-European relationships that link corresponding members in different 
European languages. I will now describe these relationships by following the evolutions 
of the individual words step by step from the 17th century up to the present time.


3 The evolution of the noun CAPITALIST from the 17th to
the 19th century


3.1 CAPITAL 


This is not the place to take up the complex history of CAPITAL at full length. Suffice
it to say that by the time that the first derivative, CAPITALIST, appeared, CAPITAL generally
referred to the property, not necessarily only money, that a rich person owned. In
double-entry bookkeeping, the term referred to the net worth owned by the merchant
after taking away the liabilities from the assets. Towards the end of the 18th century
economists extended the meaning of the term to include the means of production (buildings,
machines, tools) used in agriculture or industry, what is now called PHYSICAL CAPITAL.
This more technical sense still has not really penetrated into common language, but
it did play a role in the history of CAPITALIST and CAPITALISM, as we will see. More recent
extensions of the concept, by contrast, such as HUMAN CAPITAL or SOCIAL CAPITAL, had
no influence.

3 http://www.thefreedictionary.com/capitalism.

4 [Capitalisme masc. noun [...] Derived from capital²*; suffix -isme*. / Capitaliste adj. and noun [...] Derived
from capital²*; suffix -iste*. The hypothesis that it be a loan from Dutch kapitalist (BL.-W.⁵) does not seem
to be justified. The corresponding German word Kapitalist ‘owner of capital’ has been attested since 1694
(WEIGAND).]


3.2 CAPITALIST: the Dutch origins 


As we saw in Section 2, the TLFi rejected the hypothesis of a Dutch origin of the French 
noun capitaliste, which had first been put forward by Barbier (1944-1952: nr. XXV). This 
decision was ill-advised, since the noun Capitalist was indeed coined in the Netherlands 
(then: “United Provinces") back in 1621 by tax authorities in order to designate a wealthy 
citizen who possessed 2,000 guilders or more: 


Special registers distinguished the taxpayers into two categories: those owning 
more than 2,000 guilders were called 'capitalists' (from 1621), and those owning 
1,000 to 2,000 guilders were the so-called ‘half capitalists’ (from 1625). People own- 
ing less than 1,000 guilders were fully exempt from extraordinary property taxes. 
A proposal from 1641 to introduce another level, from 20,000 or 30,000 upwards, 
was not accepted. The word 'capitalist', here used in its earliest meaning, clearly 
designated someone owning property. ('t Hart 1993: 122-123) 


Dutch Capitalist was derived from Capitaal ‘capital’ and followed the pattern of forma-
tions in -IST that designated persons engaged in some activity, not the supporter pattern,
both of which were already well established at that time (see Wolf 1972). In order to un- 
derstand the choice of suffix, we probably have to assume that the coiner conceived of 
a Capitalist as a money-lender or investor, not as a passive possessor of a huge sum of 
money or property. Dutch Capitalist was a complex concept, designating at the same 
time a wealthy person, mostly engaged in money-lending or investment activities, as 
well as a category of the tax authorities. Since both these facets were linked by mutual 
inference, we should view them as part of one and the same concept, not as two indepen- 
dent concepts, very much like book can designate at the same time the object on the table 
and its content. It is also highly probable that the precise original definition of Capitalist 
on the part of the tax authorities (‘a person worth 2,000 guilders or more’) was relaxed 
in common parlance to refer simply to very rich individuals in general. 

The 17th century is called the “Golden Age” in Dutch historiography, because the
United Provinces at that time were at the forefront of trade, military, science and art. This
background, especially their eminent position in international finance, explains how a
Dutch neologism could spread abroad and start an astounding international career. Already
by the end of the 17th century, we find loan translations in German and French.
German Capitalist (today written Kapitalist) appears as early as 1671 in a document on 
the financial system of the United Provinces, where, due to its novelty, it is glossed as 
‘money-lender’ (Rainer 1998: 10). The German word, as far as I can see, had no influence 
on French, which will be the focus of the rest of this paper. 


3.3 French capitaliste: its semantic evolution until the Physiocrats 


There can be no doubt about the Dutch origin of the French noun capitaliste. The oldest 
example, in fact, comes from the Mercure Hollandois of 1678, p. 13 and clearly refers to 
the very special fiscal meaning which the term had at that time in the United Provinces: 
“Pour cet effet [i.e. to put up an army of 100,000 men in a fortnight] ils posoient qu'il y 
avoit dans la Province de Hollande 65 500 Capitalistes, qui étoient taxés sur les Cahiers 
de l'Etat à 2.4.6.10.20. & 80 000 livres”.⁵ The few examples that we find in French until
the middle of the 18th century (quoted under ILA in the corresponding TLF-Etym entry)
refer to that same Dutch reality. In the second half of the 18th century, however, the
noun capitaliste firmly established itself in French with a reference independent from the 
Dutch context. Here is a quote from the Dictionnaire domestique portatif (Paris: Vincent 
1765), vol. 3, p. 505: “RENTIERS; ce terme est synonyme à capitaliste, c'est-à-dire, à celui 
qui fait valoir son argent, en le disposant suivant le cours de la place, & qui vit de ses 
rentes.”⁶

The diffusion of the term among a wider public was furthered by its adoption by the 
Physiocrats, an economic school that began holding much sway at that time, in France 
and abroad. The following example from Turgot's Réflexions sur la formation et la distri- 
bution des richesses illustrates the meaning that will be the dominant one throughout the 
19th century:


§ XCIII 
Le capitaliste prêteur d'argent appartient, quant à sa personne, à la classe disponible.


Nous avons vu que tout homme riche est nécessairement possesseur ou d'un cap- 
ital en richesses mobilieres, ou d'un fonds équivalent à un capital. Tout fonds de 
terre équivaut à un capital ; ainsi tout propriétaire est capitaliste, mais tout capital- 
iste n'est pas propriétaire d'un bien fonds ; et le propriétaire d'un capital mobilier 
a le choix, ou de l'employer à acquérir des fonds, ou de les faire valoir dans des 
entreprises de la classe cultivatrice ou de la classe industrieuse. Le capitaliste, de- 
venu entrepreneur de culture ou d'industrie, n'est pas plus disponible, ni lui ni ses 


5[To that effect they assumed that there were in the province of Holland 65,500 capitalists, whose tax charge 
according to the state's tax lists was 2, 4, 6, 10, 20 or 80 thousand pounds.] 

6[RENTIERS ; this term means the same as capitalist, that is, one who invests his money according to the 
evolution of rates on the market and lives off his private income.] 
profits, que le simple ouvrier de ces deux classes ; tous deux sont affectés à la continuation
de leurs entreprises. (Turgot, Réflexions sur la formation et la distribution
des richesses, s.l. 1788, p. 125)⁷


As one can see, the term is now completely detached from its original fiscal context 
and simply refers to wealthy individuals who try to increase their capital by either lend- 
ing money at interest or investing it in productive enterprises (directly, or on the stock 
market). The meaning, therefore, roughly corresponded to both senses 2 and 3 of the Free 
Dictionary quoted in Section 2. It was not really a French innovation: already the Dutch 
capitalists typically engaged in precisely these two activities. What is new is that the 
word could now be used without reference to the particular Dutch context and that the 
fiscal perspective to which the Dutch term was originally tied had sunk into oblivion. 
By the same token, the original concept was simplified, being stripped of its fiscal facet. 


3.4 CAPITALIST spilling over to the Anglo-Saxon world 


Nowadays we strongly associate capitalism with the Anglo-Saxon world, but the truth is 
that Great Britain and the United States were the last among the big, developed nations 
to take up the word CAPITALIST. In English, capitalist does not make its appearance be- 
fore 1787, when the following example is attested in Madison’s writings (The Writings of 
James Madison, ed. G. Hunt. New York/London: Putnam’s Sons 1903, vol. 4, p. 123):8 “In 
other Countries this dependence results in some from the relations between Landlords 
and Tenants in others both from that source and from the relations between wealthy cap- 
italists and indigent labourers.” Four years later, the word is used in England by Edmund 
Burke: 


On the policy of that transfer I shall trouble you with a few thoughts. In every 
prosperous community something more is produced than goes to the immediate 
support of the producer. This surplus forms the income of the landed capitalist. It 
will be spent by a proprietor who does not labour. (Edmund Burke, The Political 
Magazine 21, 1791, p. 75) 


Up to that moment, capitalists were generally referred to as monied men in English, 
an expression that rapidly succumbed to the prestige of the newcomer, but not before 
giving rise, for a short period of time, to the blend monied capitalists. There can be no 
doubt that French was the donor language for the English calque. 


7[§ XCIII / The money-lending capitalist is part of the available class / We have seen that any monied man
necessarily owns either capital constituted of transferable riches or a property equivalent to capital. Landed 
properties are always equivalent to capital; therefore all landowners are capitalists, but not all capitalists 
own property; and the owner of transferable capital can choose to use it to buy property or to invest it in 
enterprises of the agricultural or industrial class. The capitalist who has become entrepreneur in agriculture 
or industry is no more available, neither he himself nor his profits, than the simple worker of these two 
classes; both are engaged in the continuation of their enterprises.] 

8The first attestation given in the OED is from 1792.
3.5 The capitalist as entrepreneur


From the 17th century to the 19th century, the dominant meaning of CAPITALIST in all European
languages was that of a wealthy person who made his capital “work” by lending
it at interest, buying bonds or shares, or investing it in productive activities. In this last 
case, a capitalist could easily become an entrepreneur himself, directly engaged in the 
management of the firm he owned or of which he was an associate. By shifting the atten- 
tion from the ‘monied man’ sense to this latter facet of the complex concept ‘capitalist’, 
the word eventually also became established in the new sense of ‘entrepreneur’, defined 
in the Free Dictionary as ‘a person who organizes, operates, and assumes the risk for a 
business venture’. As already observed by Passow (1927: 109-111), this shift in meaning 
occurred first in English: 


When the manufacturing capitalist of Europe shall advert to the many important 
advantages, which have been intimated, in the course of this report, he cannot 
but perceive very powerful inducements to a transfer of himself and his capital to 
the United States. (The American Museum, Philadelphia: Carey 1792, Part I, from 
January to June, Appendix II, p. 19) 


All the laws connected with our manufacturing system, appear to be founded on 
one erroneous principle, that the capitalists or masters are the only part to be pro- 
tected against combination and injustice, though the artizans or workmen have an 
equal right to be protected in their property or skill [...]. (The Parliamentary Debates 
from the Year 1803 to the Present Time. Vol. 23. London: Longman 1812. July 21, 1812 
- column 1165) 


The small farmer has disappeared, and the smaller manufacturers are superseded 
by large capitalists, who alone can afford to purchase expensive machinery. (Re- 
marks on the Practicability of Mr. Robert Owen's Plan to Improve the Condition of the 
Lower Classes. London: Leigh 1819, p. 6) 


The new sense may have arisen in English at that time due to the lack of a specific word
for ‘entrepreneur’ (entrepreneur in the relevant sense dates from the mid-19th century).
What is more surprising is that this English usage should be taken over by French, where 
the word entrepreneur, which English was to borrow a few decades later, was already well 
established. One precocious example which, at least at first sight, seems relevant in our 
context is the following from Charles Caseaux’ Considérations sur les effets de l'impôt
dans les différens modes de taxation: 


[...] on doit toujours distinguer avec le méme soin deux espéces de capitalistes 
ou propriétaires ; j'appelle les uns capitalistes de la terre, et les autres capitalistes 
de l'industrie : —les capitalistes de la terre ou territoriaux, sont non-seulement les 
propriétaires du grand capital de la terre mais ceux de toutes les espéces de capitaux 


9Note that Caseaux lived in London at that time.
nécessaires pour tirer du grand capital, tout le produit dont il est susceptible : —
les capitalistes industriels, ou de l'industrie, sont les différens propriétaires non- 
seulement du capital en argent qui met journellement le travailleur en action dans 
l'industrie comme il le met sur la terre, mais de tous ces autres capitaux appelés 
bâtimens, ustensiles, machines, crédit même etc. (Charles Caseaux, Considérations
sur les effets de l'impôt dans les différens modes de taxation, London: Spilsbury 1794,
p. 98)10


This use of capitaliste by Caseaux straightforwardly ties in with his Physiocratic back- 
ground: the capitalist, for him, is not simply a money-lender but the person who pro- 
vides capital in the broad sense of the word, that is, including both fixed (land, buildings, 
machinery, tools) and circulating (intermediate goods, operating expenses) capital. Jean 
Baptiste Say, in the fourth edition of his Traité d'économie politique, is well aware of 
the potential dangers of the polysemy of the term CAPITALIST and therefore carefully 
demarcates the concept ‘capitalist’ from that of ‘entrepreneur’: 


CAPITALISTE ; est celui qui possède un capital et qui le fait valoir par lui-même,
ou bien le prête, moyennant un intérêt, à l'entrepreneur d'industrie qui le fait val-
oir, et dès lors en consomme le service et en retire les profits. [...] Un entrepreneur
d'industrie agricole est cultivateur lorsque la terre lui appartient ; fermier lorsqu'il 
la loue. Un entrepreneur d’industrie manufacturière est un manufacturier. Un en- 
trepreneur d'industrie commerciale est un négociant. Ils ne sont capitalistes que 
lorsque le capital, ou une portion du capital dont ils se servent, leur appartient ; ils 
sont alors à la fois capitalistes et entrepreneurs. (Jean Baptiste Say, Traité d'économie 
politique, 4th edition, Paris: Deterville 1819, vol. 2, pp. 456, 469)11


Despite Say’s efforts at clarifying the meaning of CAPITALIST, some of his French com- 
patriots yielded to the new English semantics, using capitaliste in lieu of entrepreneur or 
patron ‘master’ and opposing it with ouvrier or travailleur ‘worker’. The English usage 
may have crept into the French language through translations such as the following: 


Nouveau systéme d'association entre les petits capitalistes et les ouvriers, proposé 
par l'auteur (Babbage, Charles Traité sur l'économie des machines et des manufac- 
tures. Traduit de l'anglais par Éd. Biot. Paris : Bachelier 1833, p. xiv) 


10[one always has to distinguish carefully two types of capitalists or owners; I call the first one landed capitalists,
and the other manufacturing capitalists: —the landed capitalists are not only the owners of the
important capital of the land but also of all kinds of capital necessary for deriving from the land all the
produce it can yield: —the manufacturing capitalists are owners not only of the money that makes workers
become active in the factory as it does on the land, but of all the other capitals called buildings, tools,
machines, even loans, etc.]
11[CAPITALIST: one who possesses capital and puts it to use himself or lends it to an entrepreneur on interest
who then consumes its service and reaps the profit made. [...] An entrepreneur in agriculture is called a
farmer if he owns the land, a tenant if he rents it. An entrepreneur in industry is called a manufacturer. An
entrepreneur in trade is a merchant. They are only capitalists if they own the capital, or part of the capital
they use; in that case, they are at the same time capitalists and entrepreneurs.]




Les marchandises étant le produit du capital et du travail, sont la propriété com- 
mune du capitaliste et du travailleur (ici ouvrier). (Contes de Miss Harriet Martineau 
sur l'économie politique. Traduits de l'anglais par B. Maurice. La Haye : Vervloet 
1834, p. 179) 


From the mid-1830s onwards, this new usage also became quite frequent in texts writ- 
ten by French authors and was to establish itself alongside the more restrictive traditional 
use (respectively senses ILA and ILB in the TLFi): 


Maintenant cherchons la loi qui détermine le taux des profits. Cette loi devra avoir 
un rapport intime avec celles des salaires, car le capitaliste et le travailleur se parta-
gent le même produit. (Journal général de l'Instruction publique, nouvelle série, vol.
7, nr. 95 (1838), p. 1005 [Pellegrino Rossi])


Il s'agissait de la grande question de la lutte établie entre le capitaliste et le salarié, 
entre l'entrepreneur et l'ouvrier, de la question du paupérisme enfin. (Mélanges 
Religieux, vol. 1, nr. 21, 11 juin 1841, p. 331) 


Malheureusement la question du salaire se compliqua de celle de la jouissance de la
case et du terrain en dépendant, et, ainsi enchevêtrées, elles donnèrent lieu aux plus
grandes difficultés entre le capitaliste et le travailleur. (Milliroux, Félix Demerary, 
transition de l'esclavage à la liberté. Paris: Fournier 1843, p. 31) 


It is easy to see that the rise of the ‘entrepreneur’ sense of CAPITALIST goes hand in 
hand with the progress made by the Industrial Revolution, where France followed Eng- 
land with a certain time lag. It was the Industrial Revolution that provided capitalists 
with new opportunities to put their wealth to use by engaging in industrial activities, 
instead of lending money on interest or speculating on sovereign debt or the shares 
of trading companies. This new opposition between capitalists and workers will be of 
crucial importance for the further fate of our word family from the mid-19th century
onwards. 

From a linguistic point of view, this second semantic change of CAPITALIST is another 
example of a shift of emphasis that took place within a complex concept, mirroring 
changes that had previously occurred in the extra-linguistic world. Examples such as 
these make it clear that what we call “semantic” change in historical linguistics cannot 
be described on the basis of a minimalist semantics as conceived by the structuralists 
and other semanticists, but needs to take into account concepts in all their encyclopedic 
richness. It should also be mentioned here that the rise of the ‘entrepreneur’ sense led to 
a decrease in transparency of CAPITALIST, since the new technical sense of CAPITAL on 
which it was based, introduced by the Physiocrats and focusing on land, buildings, ma- 
chinery, raw materials and intermediate goods more than solely money, had not become 
familiar to the speech community at large. The relationship between base and derivative, 
which had been quite transparent in the ‘monied man’ sense, thereby became somewhat 
obscured for ordinary speakers. 
4 CAPITALISM


Throughout the 17th century and most of the 18th century, the noun CAPITALIST was an
“only child”, pertaining to a word family with only two members, CAPITAL and CAPITAL- 
IST. At the beginning of the 19th century, however, this nuclear family started expanding 
in several directions. With CAPITALISM, a little brother was born, and CAPITALIST itself 
brought into the world an adjectival progeny, as we will see in Section 5. At the same 
time, complex incestuous relations developed between CAPITALISM and CAPITALIST, both 
in their nominal and adjectival uses. In this section, we will follow the development of 
CAPITALISM from its obscure beginnings to its establishment as one of the key notions of 
modern economic and political discourse in the mid-19th century.


4.1 CAPITALISM 'condition of being rich' (1753): a ghost word? 


Dauzat (1972) claimed that French capitalisme was used as early as 1753 in the Encyclopédie
with the meaning ‘état de celui qui est riche’.12 He was followed on this point
by the TLFi, while Braudel's search for the text alluded to by Dauzat yielded no result:
“Le texte invoqué reste introuvable”13 (1979, vol. 2, p. 205). I could not find it either in
the electronic version of the Encyclopédie that we have at our disposal nowadays.14 It
is difficult to imagine that Dauzat should have invented his early first attestation, but
something must have gone wrong. In fact, the French word cannot be found with
Google Books anywhere in the second half of the 18th century either.

However, this latter source provides one isolated early attestation of German Kapitalismus,
a clearly jocular occasionalism from Itzehoe's Komische Romane (Göttingen:
Dieterich 1787, vol. 4, p. 304), in a text full of somewhat contrived neologisms. It seems
to express very much the same sense as the one indicated by Dauzat for French: “Der
Redakteur dieser Papiere, der, wie aus allen seinen Schreibereyen hervorgeht, sich voll
tiefer Ehrerbietung gegen jegliches Menschengesicht fühlt, das nur halbwege mit dem
Stempel der Vornehmigkeit und des Kapitalismus gemarket ist, sieht sich hier in großer
Verlegenheit.”15 Since there are no other examples for German either until around 1840,
it is best to leave this potential proto-use of CAPITALISM as a riddle for future research
and turn to its first appearance in the 19th century.


4.2 CAPITALISME ‘high finance’ (ca. 1810) 


At the time of the French Revolution, the noun capitaliste had acquired distinctly negative
overtones, referring to individuals who had enriched themselves in the political and
economic turmoil of those years, to the detriment of the general good (see Höfer 1986).
We should keep this background in mind in order to understand the following passage, 


12[condition of being rich]

13[The text alluded to is nowhere to be found]

14Cf. http://portail.atilf.fr/encyclopedie/.

15[The editor of these papers who, as can be seen in all his writings, feels deference for any human face that
somehow expresses high rank and capitalism, faces great embarrassment here.]
written at the moment when Napoleon had reached the climax of his power (around
1810) and drawn from a letter addressed to a statesman by an “agent observateur” whose 
name is not disclosed: 


Mais qui [sic] dire de cette puissance nouvelle du Capitalisme, qui née du com- 
merce qu'elle ruine, a succédé avec toute son immoralité, à la puissance si morale 
de la fructification du sol qu'elle opprime en détournant ses capitaux ? de cette 
puissance qui sacrifie l'avenir au présent, et le présent à l'individualité, cette lèpre
contemporaine. Cette puissance égoïste, cosmopolite, qui s'empare de tout, ne pro-
duit rien et n'est infiniment liée qu'à elle-même ; souveraine des souverains qui
ne peuvent sans elle ni faire la guerre ni demeurer en paix ; et qui s'enrichit égale- 
ment de leur prospérité et de leur ruine, des biens du peuple qu'elle partage, de 
leurs maux qu'elle accroit ? (Alphonse de Beauchamp, Mémoires tirés des papiers 
d'un homme d'État. Paris: Michaud 1836, vol. 11, p. 46)16 


The ‘high finance’ sense of capitalisme, however, does not seem to have had a wide 
circulation. We meet it again in 1822 in Georges Laurent Aubert du Petit-Thouars’ Toujours
la guerre au cadastre français, where it is used as an antonym of propriété, designating
a society dominated by rentiers rather than the landowning class:


Deux individus, l'un capitaliste et l'autre propriétaire, ont chacun vingt-cinq mille 
livres de rente ; [...]. Ainsi la propriété, seul et véritable soutien des monarchies, 
perd tous les jours en France de son ascendant au profit du capitalisme qui de sa 
nature tend toujours au républicanisme : chaque jour nous le prouve. (Georges Lau-
rent Aubert du Petit-Thouars, Toujours la guerre au cadastre français, Paris: Trouvé
1822, p. 42)17


Significantly, the word is written in italics in order to highlight its novelty. Our third 
example appears three years after the publication of Beauchamp's work, in which the 
anonymous observer's invective quoted above had been made public, in Pons Louis 
François de Villeneuve's De l'agonie de la France:


Avec le malaise ou l'instabilité de la fortune privée, concorde le malaise encore 
plus pénétrant de la fortune sociale : et un mal nouveau, le capitalisme, insinuant 
et dangereux serpent, étouffe en ses plis et replis l'une et l'autre. [...] Autre et plus 
féconde proie est pour le capitalisme la fortune publique. Il en pompe les budgets 


16[What should one say about this new power of capitalism, which arose from the commerce that it ruins and
with all its immorality succeeded the highly moral power of agriculture that it oppresses by diverting its 
capital? About this power which sacrifices the future to the present, and the present to individualism, the 
leprosy of our days. This egoistical, cosmopolitan power that grabs everything, does not produce anything 
and is only infinitely tied to itself; sovereign of sovereigns, who cannot without it make war nor remain in 
peace; and that enriches itself both by their prosperity and their ruin, at the expense of the goods of the 
people that it divides up, of their troubles that it increases?] 

17[Two individuals, a capitalist and a landowner, both have an income of 25,000 pounds; [...] In that way
ownership, the only true support of monarchies, loses influence day by day to the benefit of capitalism, 
which by its very nature tends towards republicanism: each day proves this to be the case.] 
par la rente ; il fait comme à son gré la paix ou la guerre. (Pons Louis François de
Villeneuve, De l’agonie de la France, Paris: Perisse 1839, pp. 139-140)18


These three examples of capitalisme are still transparently tied to the old sense of the 
word capitaliste, referring to a very wealthy individual lending his money at interest or 
placing it in bonds or shares. What is less immediately obvious is the pattern of word
formation by means of which this word came into being. Was it derived from capitaliste
by affix substitution? Was it an independent derivation from capital? Nouns in -isme,
at any rate, were already in use at that time for designating economic systems, witness
colbertisme (1775, TLF-Étym) and mercantilisme (1809, TLF-Étym).19 Thus, from a chronological
perspective, these words could have served as models for capitalisme. The corresponding
nouns in -iste, colbertiste and mercantiliste, designated the supporters of the
respective doctrine. Since capitaliste did not refer to a supporter, but to a profession or
occupation, capitalisme, for semantic reasons, could not be derived by affix substitution
according to a proportional analogy of the kind colbertiste : colbertisme = capitaliste : x.
The more plausible solution, therefore, is to consider capitalisme to have been an independent
derivation on the basis of capital, following the general pattern noun + -isme
‘(economic) system somehow related to N’.


4.3 CAPITALISM as the antonym of SOCIALISM 


As we saw in Section 3.5, CAPITALIST acquired the sense 'entrepreneur' after having 
crossed the Channel (and the Atlantic), a sense that migrated back to France from the 
1830s onwards, where it has cohabited with the original sense ever since. Capitaliste, in
that way, became the antonym of ouvrier, travailleur (both ‘worker’) and prolétaire ‘proletarian’,
just like capital ‘capital’ had become the antonym of travail ‘work’. This lexical
opposition simply reflected an extra-linguistic phenomenon, namely the well-known so- 
cial divide created by the Industrial Revolution. In the 1840s, French capitalisme was also 
attracted by this lexical field and thereby was converted into the standard designation 
of the new economic system characterized by the exploitation of workers in factories 
owned and often run by a small group of capitalists/entrepreneurs. Here are some of the
first examples of this new sense, which are probably attributable to Louis Blanc:20


Une lutte récemment engagée entre Lamartine et L. Blanc a donné naissance à 
un nouveau mot ; le capitalisme. Ce n'est pas au capital, s'écrie ce dernier, que 
nous avons déclaré la guerre, mais au capitalisme ; c'est-à-dire, sans doute, aux 


18[The difficulties and instability of private fortunes match the even greater difficulties of public fortune:
and a new evil, capitalism, this insinuating and dangerous snake, suffocates in its folds the one and the 
other. (...) Another, even more fertile prey for capitalism is the public fortune. It sucks the budgets by 
means of government bonds; it makes war and peace as it pleases.] 

19Similar formations from outside the economic sphere were already older; see marianisme (1665, TLF-Étym),
spinozisme (1685, TLF-Étym), etc. 

20See already Silberner & Febvre (1940).
capitalistes. (Mémoires de l’Académie royale des sciences, belles-lettres et arts de Lyon
1, 1845, p. 282, n. 1)21


L'orateur compare la féodalité ancienne avec le capitalisme actuel. La féodalité pro-
tégeait du moins l'exploitation de la terre, et par conséquent le travail de l'ouvrier,
tandis que le capitalisme exploite l'ouvrier lui-même. (L’Ami de la religion 138, 1848,
p. 621)22


In this new 'economic system' sense, capitalisme became the antonym of an alterna- 
tive system where the workers themselves would own the capital that forms the basis 
of their activity. Avril, V. Histoire philosophique du crédit (Paris: Guillaumin 1849, vol. 1, 
p. 153) already explicitly opposed CAPITALISM and SOCIALISM: “la différence radicale qui
sépare le capitalisme du socialisme”.23 Socialisme (1831, TLFi) had already been in use
for more than a decade when capitalisme in this new sense appeared, and communisme 
(1840, TLFi) for a few years. Both may well have served as its immediate models. 

The case of capitalisme in the sense discussed here aptly illustrates the complex factors 
that come into play in the creation and diffusion of a neologism. The TLFi's statement 
that it is composed of a base capital and a suffix -isme is acceptable as a synchronic, 
though not particularly revealing, description of the word's internal makeup, but hardly 
qualifies as an etymology doing justice to the circumstances of the word's creation. At 
the outset, we have to admit that the lack of documentation does not yet allow us to 
gain full certainty about how it came into being, the most plausible scenario being the 
following: Assuming that the ‘high finance’ sense was known to the coiner, which seems 
likely, we should consider the process as one of semantic change, a conceptual adapta- 
tion of the ‘high finance’ sense to the new situation of capitalists acting themselves as 
entrepreneurs, and not just as financiers. From that perspective, the new lexical opposi- 
tion with socialisme and communisme could be viewed either as a consequence of this 
conceptual change, or as its trigger. In fact, the relevant meaning of these two terms, 
namely an ‘economic system where the means of production belong to the workers or
to society as a whole’, called for a designation for the opposite concept of an economic
system where the means of production were concentrated in the hands of a small group
of wealthy individuals. Since these means of production were referred to technically as capital
and the entrepreneurs had come to be called capitalistes, capitalisme was a natural
choice. This reconstruction of the word’s origin also neatly explains why the word was 
used with negative connotations right from the beginning: it was launched by the op- 
ponents of capitalism, while capitalists themselves and circles close to them used to call 
the then prevailing economic system libéralisme (économie de marché ‘market economy’ 
is of much more recent vintage). The transition from the ‘high finance’ sense to the ‘economic
system’ sense was therefore essentially a process of conceptual rearrangement
within an existing lexeme. Nevertheless, word formation also came into play, namely by 


21[A quarrel that recently opposed Lamartine and L. Blanc has given rise to a new word, capitalism. It is not
to capital, claims the latter, that we have declared war, but to capitalism; that is, no doubt, to capitalists.]

22 [The speaker compares feudalism with present-day capitalism. Feudalism at least protected the exploitation
of the land, and hence the activity of the worker, while capitalism exploits the worker himself.] 

23 [the radical difference that opposes capitalism and socialism] 
licensing the pattern noun + -isme with the overall meaning ‘system somehow related
to N’ (note that both socialisme and communisme have adjectival bases; therefore, strict 
proportional analogy with these two words would not suffice). 


4.4 The further fate of CAPITALISM 


The French neologism capitalisme in its ‘economic system’ sense had an immediate and
resounding international success in the wake of the 1848 revolution. I will not describe
here the diffusion of the term in different European languages,24 but concentrate instead
on its further development in French. 

By a simple metonymic process, designations of systems and similar abstract entities 
are routinely taken to refer to the persons who represent or support the system. Such 
was also the case with capitalisme. The first example of Section 4.3 could already be 
interpreted in that sense. Here is a later and clearer example of this collective sense 
(Burg, Joseph De la vie sociale... Rixheim: Sutter 1885, p. 739): “Le capitalisme, dur et 
arrogant, coudoie le paupérisme, exaspéré et découragé.”25

A more interesting conceptual change occurred at the beginning of the 20th century.
At that time, academic circles began using the term not only to refer to the contempo- 
rary economic system, what we now call INDUSTRIAL CAPITALISM, but also to economic 
systems of past times that, in their opinion, presented sufficient similarities with the 
contemporary system to be called CAPITALISM. Proto-capitalism was located in the Re- 
naissance, in the Middle Ages, or even in Antiquity. This conceptual change, which was 
the result of conscious conceptual manipulation for scientific purposes, resulted in a 
more abstract concept of capitalism, freed from some of the more contingent aspects of 
19th-century industrial capitalism, as well as its negative overtones. In France, the historian
Henri Hauser was the first to deal with the origins of capitalism in Les Origines
du capitalisme moderne en France (Paris: Larose) in 1902. However, the international suc- 
cess of this scientific sense was certainly due to the publication, some months before, of 
Werner Sombart’s monumental Der moderne Kapitalismus (Leipzig: Duncker & Humblot
1902). If Hauser had been inspired by Sombart, the new sense would have to be classified 
as a calque. 


5 CAPITALIST going adjectival 


CAPITALIST, as we saw in Section 3, started out as a noun, and it remained exclusively
nominal until the end of the 18th century. It was at that time that French capitaliste developed
adjectival uses that are still part of the language. Three different adjectival senses
must be distinguished: 1. ‘owning (a huge amount of) capital’, 2. ‘of capitalists’, and 3. ‘of 
capitalism’. 


24For German, see Hilger (1982).
25[Capitalism, hard and arrogant, rubs shoulders with pauperism, exasperated and discouraged.]
5.1 Capitaliste adj. ‘owning (a huge amount of) capital’


As early as 1790, Charles-Nicolas Ducloz-Dufresnoy, in his Observations sur l'état des 
finances, quotes a “publiciste” called Cerruti who wrote: 


On ne peut appauvrir la Capitale sans appauvrir les Provinces dont elle assemble, 
grossit, répartit et multiplie les richesses territoriales et industrielles. 

Voilà la véritable idée d'une Capitale. 

Voilà la véritable idée des Capitalistes. 

Le peuple Capitaliste est composé de tous ceux qui par leur économie ou par 
leur activité, ont formé des trésors disponibles prêts à circuler, prêts à se reposer,
prêts à se transformer en papier, prêts à se réaliser en terres. (Charles-Nicolas
Ducloz-Dufresnoy, Observations sur l'état des finances, Paris: Clousier 1790, pp. 14- 
15)26 


In the first half of the 19th century this possessive use of capitaliste established itself
in wider circles, as the following examples show: 


l'aristocratie territoriale adoucit vis-à-vis des campagnes l'aristocratie capitaliste 
(Laborde, Alexandre de Des aristocraties représentatives. Paris: Le Normant 1814, p. 


comme s'il ne suffisait pas [...] d'un imprimeur capitaliste ou laborieux pour mul-
tiplier ces produits (Revue encyclopédique, t. 49, janvier-mars 1831, p. 452)28


[la législation des Émigrés] a rendu le peuple propriétaire et la noblesse capitaliste 
(Lahaye de Cormenin, Louis-Marie de Droit administratif. Paris: Thoral 1840, t. 1, p. 
xxxvii)29


La bourgeoisie moderne [...] forme une espèce d'aristocratie capitaliste et foncière, 
[...]. (Proudhon, Pierre-Joseph Organisation du crédit et de la circulation. Paris: Gar-
nier 1848, p. 21)30


Ce n'est pas la bourgeoisie qui est boursière, c'est la société tutta quanta qui veut
être capitaliste en exploitant les éventualités des échanges. (Bianchini, Lodovico
La science du bien-être social. Bruxelles: Librairie universelle 1857, p. 351)31


26 [One cannot make the capital poorer without making poorer the provinces whose agricultural and indus- 
trial wealth it assembles, increases, distributes and multiplies. / This is the true idea of a capital. / This 
is the true idea of capitalists. / The capitalist people is composed of all those who through their savings 
and activity have formed treasures ready to circulate, ready to lie idle, ready to be transformed into paper, 
ready to be realized as landed property.] 

27[the landed aristocracy makes the capitalist aristocracy more acceptable for the countryside] 

28[as if it were not enough [...] to have a well-capitalized or hard-working type-setter in order to multiply
these products] 

29[[the legislation on emigrants] has turned the people into owners and the aristocracy into capitalists]

30[The modern bourgeoisie [...] forms a kind of capitalist and landed aristocracy] 

31[It is not the bourgeoisie who is crazy about the stock market, it is the entire society that wants to be 
capitalist by taking advantage of the opportunities of trading.] 
From a linguistic point of view, the meaning ‘owning (a huge amount of) capital’
constitutes a case of noun-adjective conversion, the base being constituted by the noun 
capitaliste with the meaning ‘person owning (a huge amount of) capital’. This conversion 
pattern does not seem to have had any direct model among words in -iste, none of which 
had a possessive meaning, by the way, if we exclude obsolete actioniste ‘shareholder’, 
which was also of Dutch origin. As argued in Section 3.2, capitaliste should be classified 
as a marginal member of the agentive niche represented by words such as aubergiste 
‘innkeeper’, copiste ‘copyist’, ébéniste ‘cabinetmaker’, latiniste ‘Latin scholar or student’, 
psalmiste ‘psalmist’. Such nouns, however, do not seem to have developed adjectival uses
(of the relevant kind), according to the information provided by the TLFi.32 The model
must therefore be sought outside derivative patterns in -iste. 


5.2 Capitaliste adj. ‘of capitalists’ 


The second adjectival sense - which, incidentally, the TLFi fails to mention - corresponds 
to a relational use referring to the corresponding noun capitaliste. Again, we find one 
early outlier in 1791, this time in a translation of Adam Smith’s Inquiry into the Nature 
and Causes of the Wealth of Nations:


Lorsque ces compagnies [...] commercent avec des capitaux réunis, et que chacun 
des membres a sa part dans le bénéfice commun ou dans la perte commune, en 
proportion des fonds qu’il y a mis ; on les appelle compagnies CAPITALISTES. (Adam 
Smith, Recherches sur la nature et les causes de la richesse des nations, translated by 
J. A. Roucher, Paris: Buisson 1791, vol. 4, p. 90) 


This passage translates the following one from Smith’s original (I quote here from the
9th edition, where, as we can see, joint stock company corresponds to the translator’s
compagnie capitaliste).


When they trade upon a joint stock, each member sharing in the common profit
or loss in a proportion to his share in this stock, they are called joint stock companies.
(Adam Smith, Inquiry into the Nature and Causes of the Wealth of Nations, 9th
edition, London: Strahan 1799, vol. 3, p. 110)


Compagnie capitaliste must therefore be considered to be a neologism created by the 
translator. The only other example provided by Google Books until the mid-19th century
is the following, which is obviously inspired by the example just quoted: 


La confection ou entretien d'un canal navigable qui ne peuvent guère être exécutés
que par des compagnies capitalistes, sont des entreprises qui portent avec elles le
privilège qui garantit aux entrepreneurs le bénéfice qu’ils doivent en retirer. (Roux,


32 Appositions such as rabbin cabaliste ‘cabalist rabbi’, moine copiste ‘monk copyist’, ouvrier ébéniste ‘cabinet 
worker’, etc. are classified as adjectival in the TLFi, but this is highly questionable. Some of the nouns 
quoted are indeed used as adjectives, but in a relational sense (e.g. la tradition ébéniste ‘the tradition of 
cabinet-making’, etc.). 
Vital De l'influence du gouvernement sur la prospérité du commerce. Paris: Fayolle
1800, p. 257)33


Overall, however, Roucher's neologism did not catch on. The more common way
throughout the 19th century of denominating a company composed of various capitalists 
in French was compagnie de capitalistes ‘company of capitalists’ or société de capitalistes 
‘society of capitalists’, both amply attested since the time of the French Revolution. 

On a larger scale, the relational sense ‘of capitalists’ only appears from the second 
half of the 19th century onwards. These examples, it seems, were independent from the
use of capitaliste by Roucher in 1791 in the term compagnie capitaliste. It is not always 
easy to distinguish the relational sense ‘of capitalists’ from the sense ‘of capitalism’, 
since capitalisme can also be understood metonymically as the totality of capitalists. In 
the following list, I have chosen examples where reference to capitalists seems more 
plausible than to capitalism as an economic system. 


[...] à fin de se délivrer de l'exploitation capitaliste et usuraire, comme ils se sont
délivrés de la tyrannie monarchique et jesuitique (Eugène Sue, Mystères du peuple,
1851, vol. 2, p. 90, quoted in: Archiv des Criminalrechts. Neue Folge. Jahrgang 1851,
p. 57)34


Comme nous le disions hier, la conjuration capitaliste, l'alliance offensive et défensive
du privilège contre le prolétariat est formée ; il y a entente cordiale entre tous
ces hommes que nous supposions ennemis : [...]. (Proudhon, P.-J. Mélanges. Articles
de journaux 1848-1852. Premier volume. Paris: Lacroix, Verboeckhoven & Cie 1868,
p. 229)35


la tyrannie capitaliste et mercantile (Colins, Jean Guillaume L'économie politique 
source des révolutions et des utopies prétendues socialistes. Paris: Librairie générale 
1856, p. Sen 


Ce sera donc bien une association ouvrière. — Ce sera une association capitaliste
où [...] le travail sera subordonné au capital. (Journal des économistes, t. 15, juillet à
septembre 1869, p. 172)37


la classe capitaliste et la classe ouvrière [...] dans le milieu capitaliste (Marx, Karl
Le capital. Tr. de J. Roy revisée par l'auteur. Paris: Lachâtre 1872, pp. 248,
285)38


33[The building and maintenance of a shipping canal, which can hardly be undertaken but by a capitalist 
company, are enterprises that come with a privilege that guarantees the entrepreneurs the profit they can 
make on it.] 

34 [in order to free themselves from capitalist and usurious exploitation, as they had freed themselves from
monarchic and jesuitic tyranny] 

35[As we said yesterday, the capitalist conspiracy, the offensive and defensive alliance of the privilege against 
the proletariat already exists; there is an entente cordiale between all these men that we deemed enemies]

36 [the capitalist and mercantile tyranny] 

37 [This will therefore indeed be an association of workers. — This will therefore indeed be a capitalist asso- 
ciation where work will be subordinated to capital] 

38 [the capitalist class and the working class (...) in capitalist circles] 
Marx est donc bien loin d’appeler subjectivement le profit capitaliste un vol (Revue
internationale du socialisme rationnel, t. 8, 1883, p. 147)39


le député Rasseneur parlerait de "l'oppression capitaliste et de la revanche prolétarienne"
(Bonnetain, Paul L'Opium. Paris: Charpentier 1886, p. 581)40


l'avidité capitaliste contraint les mécaniciens des chemins de fer à effectuer des
journées de travail de dix-huit et vingt heures (La Revue socialiste, t. 10, 1889, p.
685)41


la moyenne de la vie ouvrière est inférieure à la moyenne de la vie capitaliste (La
Réforme sociale, t. 25, 1893, p. 467)42


incapables [...] d'opposer aux exigences capitalistes une résistance efficace (La
Société nouvelle, t. 2, 1894, p. 448)43


These examples should suffice to prove the existence of the relational sense ‘of capitalists’
from the mid-19th century onwards. This relational use followed a pattern of
conversion turning personal nouns into relational adjectives that was already quite well
established by the middle of the 19th century, even with nouns in -iste (see Rainer to
appear). Outside nouns in -iste, we find the relational use of ouvrier in collocations such
as association ouvrière ‘workers’ association’ and classe ouvrière ‘working class; lit. workers’
class’ as early as 1802 in the TLFi. The same relational sense is also attested in the
TLFi for prolétaire (in the example from Bonnetain above, though, the synonymous suffixal
derivative prolétarien is used). Since the noun capitaliste by the mid-19th century
had become the antonym of ouvrier and prolétaire, it could well be that its relational use
was induced by the relational use of these two antonyms. There is no need to choose
between these two hypotheses: the influence of ouvrier and prolétaire may well have
worked in tandem with the pattern converting nouns in -iste into relational adjectives.


5.3 Capitaliste adj. ‘of capitalism’ 


The relational sense ‘of capitalism’ was also established in the French language in the
middle of the 19th century. As we saw in Section 4, capitalisme in the relevant sense
was itself a neologism at that time. Here are some early examples in which the sense ‘of
capitalists’ definitely seems less adequate than the sense ‘of capitalism’.


Le système capitaliste a été établi en France sous des conditions bien moins propices
(Sagra, Ramon de la Révolution économique. Paris: Capelle 1849, p. 81)44


le plus grand écrivain de vos théories capitalistes (Avril, V. Histoire philosophique 
du crédit. Paris: Guillaumin 1849, p. soi? 


39 [Marx is therefore far from subjectively calling capitalist profit theft]

40 [MP Rasseneur was said to speak about “capitalist oppression and proletarian revenge”]

41 [capitalist greed obliges the train drivers to work for 18 or 20 hours]

42 [the lifetime of a worker on average is shorter than a capitalist’s lifetime]

43 [unable to counter the demands of capitalists with an efficient opposition]

44 [The capitalist system has been established in France under much less favourable conditions]

45 [the greatest writer on your capitalist theories]
la négation du régime capitaliste, agioteur et gouvernemental, qu’a laissé après
elle la première révolution (Proudhon, Pierre-Joseph Idée générale de la révolution
au XIXe siècle. Paris: Garnier 1851, p. 107)46


Le résultat sera donc un accroissement de population dans le pays capitaliste B. 
(De Laveleye, Emile Etudes historiques et critiques sur le principe et les conséquences 
de la liberté du commerce international. Paris: Guillaumin 1857, p. 88)47 


From a present-day perspective, this usage seems straightforward, since most nouns
in -isme referring to ideologies and similar notions are flanked by a relational adjective in
-iste: marxisme/marxiste, racisme/raciste, etc. Morphologically, the relationship between
such pairs is one of affix substitution. What is crucial in our context is whether this relation
of affix substitution was already operative in the middle of the 19th century. The TLFi
does not provide reliable evidence bearing on this question, since in most entries a date
of first attestation is only given for the nominal use of -iste. However, relevant examples
are not difficult to come by. In many cases, one may waver between the interpretations
‘of Xists’ and ‘of Xism’: “mouvement anarchiste” (d'Ivernois, Francis Les cinq promesses.
Londres: Cox 1802, p. 149), for example, could be glossed equally naturally as ‘movement
of anarchists’ and ‘movement inspired by anarchism’, “journal légitimiste” (Procès de M.
Gisquet contre Le Messager. Paris: Pagnerre 1839, p. 1) as ‘newspaper of/for legitimists’
and ‘newspaper inspired by/defending legitimism’. In “une thèse matérialiste” (Gibon, H.
Fragments philosophiques. Paris: Hachette 1836, p. 69), however, ‘a dissertation inspired
by materialism’ would seem to be the only reasonable gloss.

We can therefore safely assume that the ‘of capitalism’ sense could be derived, by the
middle of the 19th century, from capitalisme by means of affix substitution. For the sake
of completeness, however, let us still check an alternative possibility which some might
wish to entertain. As we have seen, CAPITALIST already spilled over to the Anglo-Saxon
world at the end of the 18th century and since then it has been a much-used term in the
English language. Could it not be, therefore, that the relational sense in question was
simply due to a calque from English? In order to answer this question, let us observe
the dates of first attestation48 of the English collocations corresponding to those quoted
above for French: capitalist country (1861), capitalist system (1862), capitalist regime (1863),
capitalist theories (after 1900). As we can see, the English collocations follow the French
ones by a lapse of time of some 10 years. It may therefore safely be assumed that English
imitated French, not vice versa.
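
The comparison of attestation dates can be made explicit with a small calculation. The sketch below is only an illustration: the French dates are those of the quotations given above (système 1849, régime 1851, pays 1857), the English dates are the Google Books first attestations just cited, and the variable and function names are hypothetical. The collocation capitalist theories is left out because its English date is only given as "after 1900".

```python
# First attestations as reported in this section (years).
# French dates come from the quotations above; English dates from Google Books.
french = {"system": 1849, "regime": 1851, "country": 1857}
english = {"system": 1862, "regime": 1863, "country": 1861}


def attestation_lags(earlier, later):
    """Return, for each shared collocation, how many years `later` trails `earlier`."""
    return {key: later[key] - earlier[key] for key in earlier if key in later}


lags = attestation_lags(french, english)
mean_lag = sum(lags.values()) / len(lags)

for collocation, lag in sorted(lags.items()):
    print(f"capitalist {collocation}: English follows French by {lag} years")
print(f"mean lag: {mean_lag:.1f} years")
```

On these three collocations the individual lags range from 4 to 13 years, and their mean is close to the figure of some ten years mentioned in the text.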


46 [the negation of the capitalist, speculative and governmental regime left over from the first revolution]
47 [The result will therefore be an increase in population in the capitalist country B.]
48 Using the first book allowing a full view of the text, front matter included, in Google Books.
6 A 20th-century codicil: CAPITALIST ‘supporter of capitalism’


As we saw in Section 3, in the middle of the 19th century, the ‘entrepreneur’ sense had
been added to the ‘monied man’ sense. In the 20th century, a third sense was added to
these two, namely that of ‘supporter of capitalism’, which has largely superseded the
other two. In the second half of the 19th century, CAPITALISM had evolved from a name
characterizing an economic system to that of an ideology. Especially after the international
success of Marxism, CAPITALISM became the antonym of COMMUNISM, which could
also denote both an economic system and an ideology. Due to this status of CAPITALISM
as an antonym of COMMUNISM, CAPITALIST followed COMMUNIST in designating a person
that embraced the ideology expressed by the corresponding word in -ISM. The following
example illustrates this last transformation of CAPITALIST with French capitaliste: “Outre
la question de l'attitude du Chrétien, un point irrite particulièrement André Gide ; c'est
le reproche qui lui est fait d'être à la fois capitaliste et communiste et il s'ingénie à
retourner l'accusation contre les chrétiens”49 (Fillon, Amélie François Mauriac. Paris:
Société Française d'Éditions Littéraires et Techniques 1936, p. 330). What the author wanted
to say here is that Gide was accused of having embraced the ideologies of capitalism and
communism at the same time, not that he was a financier, investor, or entrepreneur. In
this latest sense one can even be a capitalist without possessing any money or property.

From a linguistic point of view, this last transformation of CAPITALIST is to be regarded 
as a case of affix substitution on the basis of CAPITALISM, as the gloss ‘supporter of cap- 
italism’ suggests. What is less easy to tell is whether this affix substitution first took 
place in French or in some other European language, notably English or German. The 
question is almost impossible to answer since at that time these three languages were 
already in perfect harmony concerning CAPITALIST and CAPITALISM as well as the -ISM/-IST
pattern. In French, for example, this kind of affix substitution could base itself on a
sizeable number of potential models: an anarchiste was a supporter of anarchisme, a com- 
muniste a supporter of communisme, etc. It is worth mentioning that, from a historical 
perspective, the derivative in -iste tended to occur earlier than that in -isme, but at some 
point in time the names of the supporters came to be reinterpreted as dependent on the 
names of the doctrines. 


7 Conclusion 


After having accompanied CAPITALIST and CAPITALISM in their unfolding since the 17th
century, it is time to draw some general conclusions about the relationship between word 
history and word formation and to highlight the role of the lexeme in this affair. 


49 [Apart from the question of the attitude of the Christian, one point in particular irritates André Gide: the
reproach that is addressed to him of embracing at the same time the ideology of capitalism and communism,
and he is at pains to return the charge against the Christians.]
As we have seen, semantic change, borrowing and word formation have all substantially
contributed to the evolution of these two key words of our politico-economic vocabulary.
And in each of these three modes of lexical enrichment the lexeme has been
seen to play a key role. What is traditionally called semantic change in reality should bet- 
ter be called conceptual change, as Andreas Blank convincingly argues in his 1997 book. 
The semantic changes observed in the history of CAPITALIST and CAPITALISM affected 
holistic concepts tied to lexemes, in close interaction with changes in extra-linguistic 
reality, not affixes or roots. Borrowing also repeatedly played a role: in the migration of 
CAPITALIST from the United Provinces to France, from France to the Anglo-Saxon world 
and back again, to mention just those involving French. Now, calquing is a process that 
is also located at the level of the lexeme. It can be conceived of as an analogical process 
where model and copy are located in different languages (though in the same speaker’s 
mind). If seen in this light, calquing is close to word formation, which is also best con- 
ceived of as an analogical, pattern-based process. This is particularly obvious in the case 
of affix substitution, which played a prominent role in derivatives with -IST and -ISM.

We have also seen that a full understanding of the evolution of our two words requires 
taking into consideration the structure of the lexicon at the relevant points in time. A 
lacuna in the lexicon may induce semantic change, as Passow already surmised in rela- 
tion to the rise of the ‘entrepreneur’ sense of English capitalist. The absence of a specific 
word for ‘entrepreneur’ around 1800 may have prompted the English speakers to adapt 
the meaning of capitalist, originally referring to a rich money lender or investor, in or- 
der to fill this empty slot. Another case in point may have been the introduction of the 
‘economic system’ sense of French capitalisme in the 1840s, which filled the need for an
antonym of socialisme and communisme. Similarly, the specific configuration of a seman- 
tic field may induce change, as we have seen in the case of the opposition ‘entrepreneur’ 
vs. ‘worker’, which may have helped to establish the relational use of French capitaliste 
in the ‘of capitalists’ sense, providing a ready counterpart for the already established 
relational use of ouvrier and prolétaire. The same search for formal/semantic parallelism 
was probably also operative in the rise of the ‘supporter’ sense of French capitaliste in 
the 20th century. These latter processes can be accounted for straightforwardly as proportional
analogies.

At many points in our discussion we have seen that the French historical dictionaries 
that we have at our disposal, notably the TLFi, only provide a shaky basis for detailed 
investigations into the history of word-formation patterns in post-Renaissance French. 
In some sense, the TLFi is a marvel of a dictionary, second probably only to the OED. 
Nevertheless, it is obvious in many entries that the lexicographers were overwhelmed
by the wealth of raw data at their disposal and hampered by the lack of a sound theory 
of word formation (or an inconsistent application of the theory, if they had one). The 
relationship between words in -isme and the corresponding relational adjectives in -iste, 
for example, is not given a separate etymological treatment but identified with that of 
nouns in -iste, which are themselves handled in different ways in different entries: 


Anarchiste: “Dér. du rad. de anarchie*; suff. -iste*”

Animiste: “Dér. du rad. du lat. anima (âme*); suff. -iste*”

Colbertiste: “du rad. de colbertisme, suff. -iste*”

Cubiste: “Dér. de cube*; suff. -iste*”

Fétichiste: “Dér. de fétiche* formé sur le modèle de fétichisme*; suff. -iste*”

Piétisme: “Dér. de piétiste*; suff. -isme*”

Quiétiste: “Dér. de quiétisme* par substitution du suff. -iste* à -isme”


In a proper etymological treatment, each step in the history of a word, which roughly 
corresponds to a word's subentries in a well-ordered dictionary, must be provided with 
a separate etymological explanation, and each explanation should explicitly name the 
change according to a catalogue of standard mechanisms of lexical change. In the case of 
semantic change and borrowing, a list of universal mechanisms such as calque, metaphor, 
and metonymy will generally be sufficient, though some of these mechanisms also show 
language-specific patterns that should then be named explicitly. For word formation,
by contrast, it is vital to make sure that the pattern alluded to in a certain etymological 
explanation was productive at the moment in question. 

The rather glaring shortcomings of the TLFi in that respect are now being emended 
by the TLF-Étym project, to which I am happy to contribute from time to time. Word 
histories in the TLF-Étym style are a necessary prerequisite for a history of word formation
in modern French, which constitutes a great desideratum. At the same time,
detailed studies on the history of single word-formation patterns would yield important 
contributions to historical lexicography. The two fields are so intimately intertwined
that they must of necessity evolve in tandem.
This article addresses the question of the category of morphologically constructed words,
in particular the case of forms suffixed with -iste. Most of these are categorially ambiguous
in that they can be nouns and/or adjectives. We show that there are two types of forms
suffixed with -iste: some are fundamentally nouns, while the others are fundamentally
adjectives, which can nevertheless be used as nouns under certain conditions. For this
latter case we propose an analysis in terms of coercion.


1 Introduction 


This article focuses on the categories constructed by -iste suffixation. This suffixation
raises interesting questions, since the derivatives it forms seem to belong to two
different categories, noun and adjective.

Derivatives in -iste have already been the subject of several studies, notably Dubois
(1962), Corbin (1988) and Roché (2011). Our study differs from earlier ones in that
we focus here on the output categories of -iste suffixation. In this respect
we adopt a different point of view from that of Roché (2011), who emphasizes the
semantics of the suffixation independently of the categories involved. For our part, we
are interested in the categorial relations of derivatives in -iste, which can often
be both adjectives and nouns. Since these derivatives are the product of a morphological
construction, one may ask whether one category is primary, constructed by the morphology,
with the other category obtained from it. If so, two questions arise:
the identification of the primary category and the way the other category is formed.
Alternatively, one may envisage the two categories being constructed in parallel,
or consider that the constructed forms are categorially underdetermined. These are
the questions we set out to answer.

In this article we will not compare derivatives in -iste with derivatives in
-isme, for several reasons. First, the relationship between -iste and -isme suffixation
has already been treated, notably by Corbin (1988) and, more recently
and in great detail, by Roché (2011). Second, for the question that concerns
us, namely the relationship between the adjectival and nominal categories of derivatives
in -iste, analysing forms in -iste as derived from, or constructed in parallel with,
forms in -isme does not solve the problem. Finally, a number of
derivatives in -iste have no counterpart in -isme, for example CHIMISTE,
FLEURISTE, GARAGISTE, PIANISTE, which seems to us to justify studying forms in -iste
independently of their relation to -isme suffixation.

We first present our methodology for building the corpus
and identifying categories (§ 2). We then present our analysis of forms suffixed
with -iste (§ 3 and 4) and show that there are two distinct cases, in terms of both
meaning and category. We show that in the second
case the adjectival category is primary and the nominal category secondary (§ 5). For this
latter case, after considering two possible analyses, ellipsis and conversion, we
propose our own analysis in terms of coercion (§ 6).


2 Methodology


2.1 Building the corpus


Our study of nouns and adjectives suffixed with -iste is based on data from Lexique 3
(http://www.lexique.org/). This lexicon comprises 135,000 inflected forms corresponding to
55,000 lemmas. Each form is associated with various kinds of information, such as
category; gender and number for nouns and adjectives; tense, mood, person and
number for verbs; phonetic transcription; etc. In addition to morphosyntactic
information, Lexique 3 provides the frequency of inflected forms and lemmas in
two corpora, one being a subset of recent literary texts taken from Frantext,
and the other a corpus of film subtitles.

To carry out our study of nouns and adjectives in -iste, we first
extracted from Lexique 3 all lemmas formally ending in -iste and categorized
as nouns or adjectives, together with their frequencies in the two corpora. These two
frequencies were summed for each lemma so as to retain a single
frequency value. Second, the extracted nouns and adjectives were
automatically matched against each other in order to identify nouns in -iste with no
adjectival counterpart, adjectives in -iste with no nominal counterpart, and noun-adjective
pairs. Finally, we manually validated the data in order to discard lexemes
that end in -iste but are not morphologically constructed (for example LISTE, PISTE, TRISTE),
as well as lexemes that are indeed formed with the suffix but for which -iste suffixation
is not the last morphological operation applied (for example
CHIRURGIEN-DENTISTE, EX-GAUCHISTE, PHOTOJOURNALISTE, NÉO-COMMUNISTE). After
manual validation, our study corpus contains, according to the Lexique 3 tagging,
277 nouns in -iste with no corresponding adjective, 64 adjectives in -iste with no nominal
counterpart, and 153 noun-adjective pairs.
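
The extraction and matching steps just described can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used for the study: the row layout (lemma, category, two frequencies), the category labels "NOM" and "ADJ", the function name and the toy frequencies are all assumptions made for the example, and the `excluded` set merely stands in for the manual validation step.

```python
from collections import defaultdict


def pair_iste_lemmas(rows, excluded=frozenset()):
    """Sort lemmas ending in -iste into noun-only, adjective-only and noun-adjective pairs.

    rows: iterable of (lemma, category, freq_books, freq_films) tuples (assumed layout).
    excluded: lemmas discarded by hand (unconstructed forms, prefixed forms, compounds).
    """
    categories = defaultdict(set)    # lemma -> set of categories it is tagged with
    frequency = defaultdict(float)   # (lemma, category) -> summed frequency

    for lemma, category, freq_books, freq_films in rows:
        if category not in ("NOM", "ADJ"):
            continue                 # keep only nouns and adjectives
        if not lemma.endswith("iste") or lemma in excluded:
            continue                 # formal filter, then manual exclusions
        categories[lemma].add(category)
        # The two corpus frequencies are added up into a single value.
        frequency[(lemma, category)] += freq_books + freq_films

    noun_only = sorted(l for l, c in categories.items() if c == {"NOM"})
    adj_only = sorted(l for l, c in categories.items() if c == {"ADJ"})
    pairs = sorted(l for l, c in categories.items() if c == {"NOM", "ADJ"})
    return noun_only, adj_only, pairs, dict(frequency)


# Toy rows with invented frequencies, for illustration only.
rows = [
    ("garagiste", "NOM", 1.0, 2.0),
    ("dualiste", "ADJ", 0.5, 0.25),
    ("communiste", "NOM", 3.0, 4.0),
    ("communiste", "ADJ", 1.0, 1.5),
    ("liste", "NOM", 10.0, 20.0),
    ("triste", "ADJ", 8.0, 9.0),
    ("table", "NOM", 5.0, 5.0),
]
noun_only, adj_only, pairs, frequency = pair_iste_lemmas(rows, excluded={"liste", "triste"})
print(noun_only, adj_only, pairs)
```

The purely formal filter on the ending necessarily lets through items such as LISTE or TRISTE, which is why a manual validation step (here the `excluded` set) remains indispensable.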

When examining the data from Lexique 3, we sometimes found the category tagging of forms in
-iste questionable. Among the nouns with no adjectival counterpart
in the resource, we found several lexemes for which an adjective seems to us
perfectly possible and is moreover attested, in the TLFi or elsewhere. This is the
case, for example, of ABSTENTIONNISTE, CARRIÉRISTE, CHAUVINISTE, POUJADISTE or
UTOPISTE. Conversely, the 64 forms in -iste tagged as exclusively adjectival in
the resource seemed to us to be usable as nouns as well. For
example, lexemes such as DUALISTE, FÉDÉRALISTE, RÉFORMISTE or STRUCTURALISTE
can be used as nouns, as shown by examples (1)-(4), taken from
Frantext.


(1) il sollicita les ministres en leur confessant sa vieille amitié pour le fédéraliste
expiré [...] (Balzac, 1843)


(2) et les réformistes ne furent pas les moins acharnés à défendre les formules
anciennes [...] (Sorel, 1912)


(3) accréditant ainsi pour longtemps, chez les structuralistes, la thèse de l'univocité
[...] (Hagège, 1985)

(4) Les résultats que l'on peut obtenir sont très différents de ceux auxquels visaient
les dualistes anciens [...] (David, 1965)


We therefore needed to establish criteria for determining the category of
forms in -iste.


2.2 Categorial criteria


Although the distinction between prototypical nouns and prototypical adjectives is clearly
established, there is nevertheless a grey area between these two classes, where the oppositions
are less clear-cut and the distinction between category and use is harder to draw. We
first present, very briefly, the criteria for prototypical nouns and adjectives,
and then list the contexts that can be ambiguous between the two categories.

Traditional grammar generally invokes three kinds of criteria to distinguish the
nominal and adjectival categories: morphosyntactic, semantic and syntactic
(distribution and functions). These criteria are summarized in Table 1.

While these criteria work reasonably well for distinguishing prototypical cases, they have
often been criticized (see for example Wierzbicka (1998), Croft (2001, 2002), Dixon &
Aikhenvald (2002), Haspelmath (2007), to cite only a few recent works), because they
leave aside many common uses that violate one or another
of them, in particular predicative constructions (5), whether the predication is
primary (5a) or secondary (5b), and the detached modifier (épithète détachée) (6).
Table 1: Categorial criteria

Criterion | Prototypical noun (e.g. table) | Prototypical adjective (e.g. grand)
morphosyntactic | inherent gender, variation in number | variation in gender and number
semantic | denotes entities | denotes properties
distribution | is preceded by a determiner; can be expanded by an adjective, a prepositional phrase or a relative clause | can be modified by a degree adverb; can be followed by an expansion in the form of a prepositional phrase or a complement clause
typical functions | subject, direct object, indirect object | predicative complement (attribut), attributive modifier (épithète)


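(5) a. Pierre est {intelligent/avocat}. ‘Pierre is {intelligent/a lawyer}.’

    b. J'ai un ami {intelligent/avocat}. ‘I have a friend who is {intelligent/a lawyer}.’


(6) Pierre, {intelligent comme toujours/avocat de renom}, a signalé que... ‘Pierre, {intelligent as ever/a renowned lawyer}, pointed out that...’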


In these constructions, a noun like AVOCAT can be used without a determiner
and thus behaves in the same way as an adjective like INTELLIGENT.
Not all nouns can enter into this type of construction, however: a noun
like TABLE does not have the same capacity as AVOCAT, as shown by the
examples in (7).


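(7) a. * Ceci est table. ‘This is table.’

    b. * J'ai un meuble table. ‘I have a table piece of furniture.’

    c. * Ce meuble, table nouvellement achetée, est vraiment superbe. ‘This piece of furniture, newly bought table, is really superb.’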


Nouns denoting professions and social functions, such as AVOCAT, therefore form a
specific class. They are no doubt non-prototypical nouns, but they nevertheless
meet all the other criteria that characterize nouns, in particular the syntactic
criteria, both distributional and functional. On the other hand, this characteristic behaviour
of nouns of profession or social function does not fully assimilate them to
adjectives either: in particular, they cannot be coordinated with a qualifying adjective,
as example (8) shows.


(8) ?? Pierre est grand et avocat. ‘Pierre is tall and a lawyer.’


We have thus treated as nouns all forms in -iste that meet
the criteria for prototypical nouns (Table 1), as well as those that can be
used without a determiner in contexts (5) and (6) but cannot be coordinated
with a qualifying adjective as in context (8).
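
The decision rule stated above can be written out explicitly. The following Python sketch is only an illustration of the logic of the criteria: the three boolean fields stand for acceptability judgments made by the linguist, and the class and function names are hypothetical rather than part of the original study.

```python
from dataclasses import dataclass


@dataclass
class Judgments:
    """Acceptability judgments for one form (assumed to be supplied by hand)."""
    meets_noun_criteria: bool          # criteria for prototypical nouns (Table 1)
    bare_in_predication: bool          # usable without determiner in contexts (5) and (6)
    coordinates_with_adjective: bool   # coordinable with a qualifying adjective, as in (8)


def counts_as_noun(j: Judgments) -> bool:
    """A form is treated as a noun if it meets the prototypical noun criteria,
    or if it can occur bare in (5)-(6) without being coordinable as in (8)."""
    if j.meets_noun_criteria:
        return True
    return j.bare_in_predication and not j.coordinates_with_adjective


# Judgments mirroring the discussion of AVOCAT and INTELLIGENT above.
avocat = Judgments(meets_noun_criteria=True, bare_in_predication=True,
                   coordinates_with_adjective=False)
intelligent = Judgments(meets_noun_criteria=False, bare_in_predication=True,
                        coordinates_with_adjective=True)

print(counts_as_noun(avocat), counts_as_noun(intelligent))
```

The function only decides whether a form counts as a noun; whether the same form also has an adjectival categorization is a separate question, taken up in the following sections.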
Applying these criteria allowed us to identify two types of forms in -iste:
those that are used only as nouns and those that are doubly categorized,
as noun and adjective.1 Sections 3 and 4 describe these two cases.


3 Nominal forms in -iste


The -iste forms with an exclusively nominal use form a relatively homogeneous set from a morphological point of view. Nearly all of these -iste nouns derive from nouns (9a). Two examples (9b) are formally built on adjectives but in fact derive from multiword units of nominal category: the criminaliste studies droit criminel 'criminal law' (a subtype of law), and the interniste studies médecine interne 'internal medicine' (a subtype of medicine).


(9) a. AUBERGISTE (<AUBERGE), CAVISTE (<CAVE), DENTISTE (<DENT), MACHINISTE 
(<MACHINE), NOUVELLISTE (<NOUVELLE), VIOLONISTE (<VIOLON) 


     b. CRIMINALISTE (<DROIT CRIMINEL), INTERNISTE (<MÉDECINE INTERNE)


In a few cases the base is ambiguous between noun and verb (10), but from a semantic point of view an analysis starting from the noun is always possible when the noun is associated with an activity.


(10)  ARCHIVISTE (<ARCHIVE/ARCHIVER), CARICATURISTE (<CARICATURE/CARICATURER), 
CONTORSIONNISTE (<CONTORSION/CONTORSIONNER), COPISTE (<COPIE/COPIER), 
ILLUSIONNISTE (<ILLUSION/ILLUSIONNER), POLÉMISTE (<POLÉMIQUE/POLÉMIQUER),
VOCALISTE (<VOCALISE/VOCALISER)²


Finally, we found one noun formed on an initialism, CIBISTE (<CB = CITIZEN-BAND), and another derived from a verb or from the corresponding noun in -isme: EXORCISTE (<EXORCISER/EXORCISME).

From a semantic point of view, these nouns are fundamentally names of occupations or social functions. They correspond to one of the two categories identified by Wolf (1972), the other being names of adherents. In general, these occupational nouns in -iste do not accept adjectival use:


(11) ?? ils sont nombreux à vouloir choisir le métier garagiste 'many of them want to choose the garage-mechanic trade'


In (11) garagiste does not seem to function as an attributive adjective whose role would be to qualify the head noun, but rather as a noun. There is indeed a hyperonymy/hyponymy relation between métier (the hyperonym) and garagiste (the hyponym).³ It is nevertheless possible to find examples of adjectival use, as in (12):


¹ Although carried out in a radically different framework, this distinction into two subsets matches the two cases identified by Dubois (1962) and Dubois & Dubois-Charlier (1999).

² In some cases the final segment of the base lexeme is truncated before the suffix -iste, all the more so if it already contains an [i]. Thus for POLÉMISTE the final segment ique (if the base is nominal) or iquer (if the base is verbal) is truncated. This truncation is not tied to the categorial ambiguity of the base: it is also found in FATALISTE, derived from FATALITÉ (or FATALISME). Nor is it specific to the suffix -iste: it is observed fairly frequently in French, with various suffixes. On this point see Corbin & Plénat (1992).


(12) Je ne suis pas d'un tempérament archiviste (Le Monde, 9 février 2008)⁴ 'I am not of an archivist temperament'


According to Rainer (2016), the possibility of using an occupational noun (an agent noun, in his terms) in -iste as a qualifying adjective depends on how easily a stereotypical quality can be associated with the referent of the noun. In the case of ARCHIVISTE, one can fairly easily attribute to the referent the quality of being conservative and orderly. However, not all occupational nouns in -iste lend themselves so readily to stereotypes. We therefore do not follow Rainer (2016), who considers that present-day French has a well-established pattern that forms -iste adjectives by morphological conversion N > A. Such an analysis does not convince us, since the use of an occupational or social-function noun in adjectival position concerns only a small number of nouns and does not seem to be a productive and regular process.


4 Doubly categorized forms in -iste


The -iste suffixed forms that have both categories, nominal and adjectival, form by contrast a less homogeneous class from a morphological point of view. As Roché (2011) noted, they can derive from common nouns (13a), proper nouns (13b), adjectives (13c) or verbs (13d). They can also have as their base something other than a lexeme, such as initialisms (13e) or phrases (13f).


(13) a. ANARCHISTE (<ANARCHIE), CENTRISTE (<CENTRE), CAPITALISTE (<CAPITAL),
        CYCLISTE (<CYCLE), GAUCHISTE (<GAUCHE_N), IDÉALISTE (<IDÉAL_N), HUMORISTE
        (<HUMOUR), NOMBRILISTE (<NOMBRIL), PROGRESSISTE (<PROGRÈS), SEXISTE (<SEXE),
        TERRORISTE (<TERREUR)
     b. BOUDDHISTE (<BOUDDHA), CALVINISTE (<CALVIN), FRANQUISTE (<FRANCO),
        GAULLISTE (<DE GAULLE), MARXISTE (<MARX), ORLÉANISTE (<ORLÉANS), SIONISTE
        (<SION), TROTSKISTE (<TROTSKY)
     c. COMMUNISTE (<COMMUN), LOYALISTE (<LOYAL), MODERNISTE (<MODERNE), POSITIVISTE
        (<POSITIF), SIMPLISTE (<SIMPLE)
     d. ARRIVISTE (<ARRIVER), CONFORMISTE (<SE CONFORMER), DIRIGISTE (<DIRIGER)
     e. CÉGÉTISTE (<CGT), VÉTÉTISTE (<VTT)
     f. FIL-DE-FÉRISTE (<fil de fer), JUSQU'AU-BOUTISTE (<jusqu'au bout)


For a number of lexemes, such as those presented below, the base is ambiguous between noun and adjective (14) or between verb and noun (15). According to Roché (2011), in the cases under (14) the formal base (the stem, in the author's terms) is the adjective, whereas the -iste lexeme would derive semantically from the noun.


(14) ABSENTÉISTE (<ABSENCE/ABSENT), EXISTENTIALISTE (<EXISTENCE/EXISTENTIEL), FÉMINISTE
     (<FEMME/FÉMININ), INDIVIDUALISTE (<INDIVIDU/INDIVIDUEL), ROYALISTE (<ROI/ROYAL)


(15) ALARMISTE (<ALARMER/ALARME), RÉCIDIVISTE (<RÉCIDIVER/RÉCIDIVE), SÉPARATISTE
     (<SÉPARER/SÉPARATION), TRANSFORMISTE (<TRANSFORMER/TRANSFORMATION)


³ On noun vs. adjective ambiguities in attributive position, see e.g. Noailly (1999).
⁴ Example borrowed from Rainer (2016).


From a semantic point of view, doubly categorized -iste forms are by contrast more homogeneous: as adjectives they refer to behavioural, ideological, moral or philosophical properties. As nouns they denote either adherents or practitioners of an ideology, a philosophy, a discipline or an activity (16), or people habitually given to a certain behaviour (17).


(16) AUTONOMISTE, CÉGÉTISTE, CYCLISTE, GRÉVISTE, JANSÉNISTE, MONARCHISTE 


(17) ABSENTÉISTE, ALTRUISTE, FATALISTE, MATÉRIALISTE, JE-M'EN-FOUTISTE


Roché (2011) also mentioned the possibility for -iste derivatives to denote demonyms, such as NORDISTE, and one unclassifiable case, UNIJAMBISTE, to which SIMPLISTE can be added.

Finally, from a syntactic point of view, these doubly categorized -iste forms seem to behave both as true nouns and as true adjectives. They are true adjectives by virtue of the functions they can assume (cf. the criteria presented in Section 2.2), but also in their ability to take degree marking, as in (18).


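(18) a. Le jeune Du Camp devient très socialiste. (Flaubert, 1850) 'Young Du Camp is becoming very socialist.'
     b. Qui est cet électeur frondeur dans ce territoire fortement socialiste ? (Web) 'Who is this rebellious voter in this strongly socialist territory?'
     c. Fournière ne connaissait pas d'âme plus socialiste et de cerveau plus fécond que Leroux. (Web) 'Fournière knew no soul more socialist and no mind more fertile than Leroux.'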


As nouns, these forms likewise behave as true nouns: they can take any type of determiner, whether definite (19a-b), indefinite (19c-d), demonstrative (19e) or numeral (19f), and they can be used in the singular as well as in the plural. Moreover, they do not seem to show any "categorial deficiency" according to the criteria of Lauwers (2014c), and they are fully countable, as shown by the possibility of determination by plusieurs 'several' (19d) or by a numeral (19f).


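(19) a. Aux élections, il voterait pour le socialiste. (Aragon 1936) 'In the elections, he would vote for the socialist.'
     b. du côté de la Bastille où les socialistes organisaient un grand rassemblement (Osmont 2012) 'over by the Bastille, where the socialists were organizing a large rally'
     c. Un socialiste se leva, mais un second extravagant l'arrêta de la main. (Malraux, 1937) 'A socialist stood up, but a second eccentric stopped him with his hand.'
     d. plusieurs socialistes de Londres étaient venus nous voir pour dissuader Georges de se marier à l'église (Torrès 1939-1945) 'several socialists from London had come to see us to dissuade Georges from getting married in church'
     e. il faudrait qu'il devînt le point de ralliement de ces socialistes (Barrès 1918) 'he would have to become the rallying point of these socialists'
     f. chez les Alsaciens : on avait repéré deux socialistes (Sartre 1949) 'among the Alsatians: two socialists had been spotted'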


In addition, these nominal -iste forms can, like any noun, be modified by an adjective, a prepositional phrase or a relative clause (20), and can assume all nominal functions (21).


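(20) a. le milord Link est détesté de ses collègues pour être partisan de ce terroriste anglais (Stendhal, 1835) 'Lord Link is detested by his colleagues for being a supporter of that English terrorist'
     b. espérons que ce réaliste de profession n'est pas trop romanesque (Sand, 1866) 'let us hope that this realist by profession is not too romantic'
     c. les fétichistes qui vénéraient certaines parties de son corps (Duvignaux, 1957) 'the fetishists who venerated certain parts of his or her body'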


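(21) a. Subject: les communistes suscitent l'admiration (Jablonka, 2012) 'the communists arouse admiration'
     b. Direct object: néanmoins il aimait bien les communistes (Osmont, 2012) 'nevertheless he was fond of the communists'
     c. Complement of a noun: au fond de toutes les théories des communistes (Proudhon, 1840) 'at the bottom of all the theories of the communists'
     d. Complement of an adjective: Celui-là [...] roide comme un communiste (Balzac, 1846) 'That one [...] stiff as a communist'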


Doubly categorized -iste forms thus seem to be adjectives as much as nouns. Consequently, the question of the relationship between the nominal and adjectival categories arises crucially. The next section is devoted to analysing this question.


5 Categorial orientation


Since -iste forms are morphologically derived, several analyses of the categorial relation between adjective and noun are possible: either one of the two categories is built by -iste suffixation and the other is derived from it, in which case the task is to determine which category is primary; or both categories are formed in parallel by the suffixation rule. Roché (2011: 92), for his part, considers that -iste derivatives are underspecified for category and that their nominal or adjectival use is determined by context. We do not subscribe to this analysis in terms of categorial indeterminacy; we think, on the contrary, that -iste derivatives are not only categorized but are first and foremost adjectives, their nominal use being secondary. To reach this result we explore two criteria: the frequency of adjectival and nominal uses (§ 5.1) and the emergence of these two uses in diachrony (§ 5.2). We then analyse the semantic characteristics of the nominal and adjectival uses in order to show the priority of the adjectival category (§ 5.3).


5.1 Frequencies


In order to determine the direction of the relation between two homonymous forms belonging to different categories, Marchand (1964) proposes relying on the frequency of use of the two forms. According to him, the more frequent form is primary and the less frequent one is derived. We therefore examined the adjectival and nominal frequencies of doubly categorized -iste forms in Lexique 3. This criterion was applied only to the 153 noun-adjective pairs drawn from the lexicon. Forms that we consider doubly categorized according to the criteria presented in Section 2.2 but that are recorded in Lexique 3 only as adjectives could not be taken into account, since their frequency in nominal use is necessarily absent from the resource.

We first looked at the mean frequency as noun and as adjective for the whole set of 153 doubly categorized -iste forms: it is 1.51 for the adjectival forms and 1.73 for the nominal forms. The difference is minimal, all the more so as one form acts as an outlier: ARTISTE, whose frequency as a noun is 86.66, whereas it is only 13.2 as an adjective.⁵ In the majority of cases (cf. Table 2, which lists the frequencies of the first nine -iste forms in our corpus), a given form has a more or less identical frequency as noun and as adjective (ABSENTÉISTE, ANABAPTISTE), with the nominal use being slightly more frequent in some cases (ACTIVISTE) and the adjectival use in others (ALTRUISTE).


Table 2: Frequency of A and N uses for the same -iste form


Lexeme           freq_A   freq_N
abolitionniste   0.1      0.24
absentéiste      0.01     0.01
activiste        0.75     1.42
adventiste       0.08     0.27
affairiste       0.15     0.28
alarmiste        0.51     0.28
altruiste        0.86     0.12
anabaptiste      0.42     0.34
anarchiste       4.53     6.48


In terms of frequency, then, nothing allows us to claim that one category is more basic than the other.
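The frequency comparison can be made concrete with a short sketch. It covers only the nine forms listed in Table 2, not the full set of 153 pairs, so its means differ from the 1.51 and 1.73 reported above. The function names `compare` and `mean` are hypothetical, and reading Marchand's (1964) criterion as a simple per-form comparison is an assumption made for illustration, not a description of the procedure actually followed.

```python
# Frequencies copied from Table 2 (Lexique 3): form -> (freq_A, freq_N).
TABLE_2 = {
    "abolitionniste": (0.10, 0.24),
    "absentéiste": (0.01, 0.01),
    "activiste": (0.75, 1.42),
    "adventiste": (0.08, 0.27),
    "affairiste": (0.15, 0.28),
    "alarmiste": (0.51, 0.28),
    "altruiste": (0.86, 0.12),
    "anabaptiste": (0.42, 0.34),
    "anarchiste": (4.53, 6.48),
}


def compare(freqs):
    """Count the forms whose adjectival (A) or nominal (N) use is more frequent."""
    counts = {"A": 0, "N": 0, "=": 0}
    for freq_a, freq_n in freqs.values():
        if freq_a > freq_n:
            counts["A"] += 1
        elif freq_n > freq_a:
            counts["N"] += 1
        else:
            counts["="] += 1
    return counts


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


counts = compare(TABLE_2)
mean_a = mean(a for a, _ in TABLE_2.values())
mean_n = mean(n for _, n in TABLE_2.values())
print(counts, round(mean_a, 2), round(mean_n, 2))
```

On this small sample neither category wins consistently, which is in line with the conclusion drawn above: frequency alone does not settle the orientation.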


⁵ The diachronic evolution of ARTISTE makes it a lexeme quite apart in the series of doubly categorized terms.


5.2 Emergence of the categories in recent diachrony


We then carried out a small study in recent diachrony in order to determine whether the doubly categorized forms showed a preference for nominal or adjectival use in their earliest attestations. Our hypothesis was that if an -iste form fundamentally has one category, conferred by its mode of morphological formation, the other category should be attested later, and its acquisition should be gradual. To check this, we selected from the doubly categorized corpus eight forms attested after 1800, namely FÉTICHISTE (1824), GAUCHISTE (1839), COMMUNISTE (1840), ABSENTÉISTE (1853), PACIFISTE (1902), ROUSSEAUISTE (1912), CENTRISTE (1922) and FRANQUISTE (1936), and retrieved their first hundred contexts of occurrence in Frantext.

Analysis of the contexts of the eight forms studied showed that, for each -iste form, the two categories are attested almost simultaneously, as examples (22)-(24) show. Note that these examples are taken from the very first attestations of these forms recorded in Frantext.


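(22) a. la république communiste de Platon suppose [...] (Proudhon, 1840) 'Plato's communist republic presupposes [...]'
     b. au fond de toutes les théories des communistes (Proudhon, 1840) 'at the bottom of all the theories of the communists'


(23) a. les penseurs théologistes, et même fétichistes, l'appliquèrent mieux (Comte 1852) 'theologist, and even fetishist, thinkers applied it better'
     b. la naïve situation des vrais fétichistes. (Comte, 1852) 'the naive situation of the true fetishists'


(24) a. soutenait un point de vue ultra-gauchiste (Queneau, 1937) 'upheld an ultra-leftist point of view'
     b. sa bande de petits gauchistes (Beauvoir, 1951) 'his or her gang of little leftists'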


Moreover, these -iste forms behave fully both as nouns and as adjectives from their earliest attestations. The analysis of adjectival and nominal uses in diachrony therefore does not, any more than the frequencies, allow us to determine whether one category precedes the other.


5.3 Semantic constraints


Finally, we studied the semantic characteristics of the adjectival and nominal uses, and these lead us to consider that the adjectival category is primary and the nominal category secondary. We observed that the nominal use is much more semantically constrained than the adjectival use. An adjective such as FANTAISISTE 'fanciful', for example, can apply to different types of nouns: nouns denoting humans (une personne fantaisiste) or abstract objects (une idée fantaisiste), and even, though more rarely, nouns denoting concrete objects (un meuble fantaisiste). As a noun, however, it can only refer to a human. One can say un fantaisiste to designate a fanciful man, but one would never, it seems to us, say un fantaisiste to speak of a behaviour, nor une fantaisiste to designate a fanciful idea or theory. This behaviour is not specific to FANTAISISTE; on the contrary, it is observed systematically for all -iste adjectives: one can say un {personnage/projet/bâtiment} futuriste, but un futuriste can only designate a person; les {personnes/thèses} progressistes are both possible, but les progressistes designates only a group of humans. This very strong constraint justifies, in our view, the priority of the adjectival category and the orientation adjective > noun.

The question then arises of the shift from adjective to noun: what type of process makes this change of category possible? The next section reviews the various possible analyses of the phenomenon before presenting the one we propose.


6 Formation of deadjectival nouns


6.1 Ellipsis


A first possibility would be to consider that -iste nouns derived from adjectives are formed by ellipsis, along the lines of the traditional analysis. This is also the treatment proposed more recently by Borer & Roy (2010), Alexiadou & Iordăchioaia (2013) or McNally & de Swart (2015) within broader analyses of deadjectival nouns, whether suffixed or not. Under this analysis, un humaniste would be obtained from un homme humaniste by ellipsis of the noun homme. Such an analysis raises several problems, however.

The first problem concerns the elided noun. In cases clearly identified as ellipsis, the elided noun varies with the context. With deadjectival -iste nouns, however, only a small number of nouns could be elided, such as homme, femme or personne.

Then there is the question of the gender of the elided noun. When a noun is elided within a noun phrase, its gender is preserved and is visible on the determiner and the adjective, as the examples in (25) show.


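(25) a. Il y a plusieurs robes dans la vitrine. J'aime beaucoup la bleue. 'There are several dresses in the window. I really like the blue one.'
     b. À l'animalerie, Paul a choisi une souris grise, et Marie une blanche. 'At the pet shop, Paul chose a grey mouse, and Marie a white one.'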


For -iste nouns, the most natural interpretation seems to be 'person who...'. Yet the masculine gender of un humaniste could not be explained if the elided noun were personne, which is feminine.

Finally, the interpretation of -iste nouns is also problematic: these nouns systematically denote humans and do not seem able to designate any other type of concrete object. Yet if -iste nouns were obtained by ellipsis, they should be able to denote any type of entity, as in the examples in (25), where la bleue designates an artefact whereas une blanche denotes an animate being.

It thus seems that an analysis by ellipsis of a noun cannot explain the formation of these -iste nouns derived from adjectives.


6.2 Conversion 


Another possibility is to analyse these nouns as converts, that is, as outputs of conversion. Adjective > noun conversion does exist in French (Corbin 1987, Kerleroux 1996), as in the examples in (26).


(26) CALME_A > CALME_N, BLEU_A > BLEU_N


Corbin (1988) in fact analyses -iste nouns as converted from adjectives. This analysis is justified insofar as -iste nouns show all the properties of nouns, as presented in Section 4. However, they also display adjectival properties, notably the possibility of being modified by a degree adverb, as shown by the examples in (27), found on the Web.


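(27) a. Il reste que Letizia d'Espagne, critiquée par les très royalistes et admirée par les moins conventionnels, incarne la princesse moderne par excellence 'The fact remains that Letizia of Spain, criticized by the staunchest royalists (lit. the very royalists) and admired by the less conventional, embodies the modern princess par excellence'
     b. les très idéalistes ne se retrouvent pas facilement ensemble et au contraire se trouvent souvent en plein contentieux 'the highly idealistic (lit. the very idealists) do not easily come together and, on the contrary, often find themselves in open dispute'
     c. Seuls les esprits étriqués ont jamais pensé que le réel se limitait à ce que nous en percevions ! clament les plus idéalistes. '"Only narrow minds have ever thought that reality was limited to what we perceive of it!" proclaim the most idealistic.'
     d. En tête de liste, l'enseignement. Les plus alarmistes pourraient imaginer des professeurs purement et bonnement remplacés par des ordinateurs 'At the top of the list, teaching. The most alarmist could imagine teachers purely and simply replaced by computers'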


Now, a noun cannot normally be modified by an adverb, unless it is coerced by a predicative construction (Lauwers 2014b), like femme in Marie fait très femme maintenant 'Marie looks very womanly now', which will be discussed in the next section (example (29)). This ability to be modified by an adverb shows that -iste nouns denoting adherents are not ordinary nouns. An analysis by conversion therefore does not seem satisfactory to us, since it cannot explain this ability. A convert displays all the properties of the category to which it belongs, as Kerleroux (1996) stressed, but does not normally display the syntactic properties of its base. This is why we present an alternative analysis in the next section.


6.3 Coercion 


To account for the properties, both adjectival and nominal, of -iste nouns denoting adherents, we propose an analysis in terms of coercion. We first present the different types of coercion (§ 6.3.1) before showing how "override coercion"⁶ accounts for the particularities of the adjective-to-noun shift that resisted the analyses by ellipsis or by conversion (§ 6.3.2). Note that this coercion analysis is similar to the one proposed by Lauwers (2008, 2014a) for deadjectival property nouns.


6.3.1 Different types of coercion


Since the 1990s, an abundant literature has been devoted to coercion; see for example Pustejovsky (1991), Jackendoff (1997), Michaelis (2003), Francis & Michaelis (2003) and Lauwers & Willems (2011). As Lauwers & Willems (2011: 1219) put it, "at the basis of coercion, there is a mismatch (cf. Francis & Michaelis 2003) between the semantic properties of a selector (be it a construction, a word class, a temporal or aspectual marker) and the inherent semantic properties of a selected element, the latter being not expected in that particular context."


⁶ We choose to translate the term override by forçage, as this is the term that seemed to us to correspond best to the definition given; see below.

Audring & Booij (2016) distinguish three types of coercion: coercion by selection, coercion by enrichment and coercion by override (forçage). The first two types are fundamentally contextual adjustments of semantic features; coercion by override, which is the strongest type of coercion and the one with the widest scope, is based on the "override principle" of Michaelis (2003: 9): "Override principle. If lexical and structural meanings conflict, the semantic specifications of the lexical element conform to those of the grammatical structure with which that lexical item is combined." In coercion by override, it is the context that takes precedence over the (semantic, categorial or syntactic) properties of the coerced item and imposes its interpretation on it.

For French, F. Kerleroux proposed a relatively similar analysis as early as the beginning of the 1990s (cf. notably Kerleroux 1991, 1996), through the notion of "categorial distortion" (distorsion catégorielle). Drawing on the opposition drawn by Milner (1989) between term and position, she accounted for cases such as example (28), where the adjective élégant is used in nominal position.


(28) Il est d'un élégant ! 'He is so elegant!'


For her, it is the mismatch between the category of the term itself (an adjective) and the position in which it is used (in a noun phrase, after a determiner) that accounts for the behaviour and the particular interpretation of élégant in this context. Such an analysis also corresponds more or less to the one that Lauwers (2014a) proposes for certain deadjectival abstract nouns.


6.3.2 Override coercion (coercion par forçage)


Analysis in terms of coercion is frequent in Construction Grammar to account for cases such as (29):


(29) Marie fait très femme 'Marie looks very womanly'


In this example, a noun (femme) is used in a typically adjectival context, namely a predicative context, with modification by the intensity adverb très (cf. § 2.2). Femme does not really become an adjective, but its interpretation in a context such as this one will be similar to that of an adjective: what matters here are the properties prototypically associated with it.

Such an analysis can easily be transposed to -iste adjectives used as nouns. We therefore hypothesize that these adjectives are coerced by being integrated into a noun phrase (NP), that is, a context designed to be saturated by a noun:


(30) a. prototypical case: [NP_count Det N] ↔ 'count NP'
     b. override coercion: [NP_count Det A] ↔ 'count NP'


The representation, greatly simplified, borrows from Construction Grammars (notably Booij 2010), in which a construction, for example an NP, is a form/meaning pairing: the form appears to the left of the double arrow, between square brackets, and the meaning to the right, enclosed in single quotes. In (30b), placing an adjective in a slot normally reserved for a noun (cf. the prototypical case illustrated in (30a)) thus gives the whole a nominal interpretation, identical to the one it would have if the term were a noun. We chose to specify that the NP is a count NP in order to account for the semantics of -iste nouns (they denote individuals) and for the fact that they can be preceded by all types of determiners (cf. examples (19)).⁷

Before turning to the advantages of such an analysis, we would like to return to one point, concerning the possible lexicalization of these forms. The coercion process as just described explains the appearance of nominal forms derived from the corresponding adjectives. Some nominal forms may, however, be used with high frequency and become established by usage.⁸ This is the case, for example, of communiste, whose frequency as a noun (37.28 according to Lexique 3) is roughly identical to its frequency as an adjective (36.17 in Lexique 3). Some nominal forms have even become markedly more frequent than the corresponding adjectival forms, as with terroriste (N: 19.12 vs A: 8.87). Such forms can then be lexicalized as nouns and appear as such in dictionaries. Optimiste (A: 8.4 vs N: 1.84), réaliste (A: 12.86 vs N: 1.02), intimiste (A: 0.24 vs N: 0.07) and fantaisiste (A: 2.56 vs N: 0.97), by contrast, remain fairly fundamentally associated with the category of the adjective, and coercion no doubt still plays its full role when they are used as nouns.

This coercion analysis has, in our view, at least two major advantages:


(i) on the one hand, it explains the ease with which these deadjectival forms can be obtained: since coercion is a syntactic phenomenon, this accounts for the systematic character of their formation;


(ii) on the other hand, and above all, it explains why deadjectival -iste nouns can still have adjectival properties, notably modification by a degree adverb (e.g. les très idéalistes in (27)): as coerced forms, deadjectival -iste forms are not fully nouns but adjectives in nominal use.


This analysis of nouns derived from -iste adjectives fits into a broader analysis of the adjective/noun alternation, a phenomenon present throughout the lexicon and which we have described in Amiot & Tribout (to appear): any adjective, whether simple (JEUNE, GRAND), morphologically complex (AMBITIEUX, PARLEMENTAIRE) or derived from a participle (BLESSÉ, PERDANT), can be used as a noun to designate a human, provided that the property denoted by the adjective can characterize humans. Ambition, for example, can characterize a person (e.g. un homme ambitieux), which is why the adjective AMBITIEUX can be used as a noun to refer to a human being (un ambitieux). Conversely, an adjective such as ARGILEUX 'clayey' can hardly characterize a human being and so cannot be used as a noun denoting a human (??un argileux).


⁷ To account for the formation and characteristics of deadjectival property nouns, Lauwers (2014a) had for his part hypothesized that the adjectives were integrated into mass NPs.

⁸ On the role and function of frequency, see for example Bybee (2006), Bybee & Thompson (1997), Ellis (2002), Gries (2013).

Compared with this general case, the specificity of -iste suffixation lies in its particular affinity with humans: witness the semantics of the nominal derivatives, which are names of occupations and social functions (e.g. DENTISTE, GARAGISTE); witness also the semantics of the adjectival derivatives, which generally denote properties relating to behaviours (ABSENTÉISTE, ALARMISTE, INDIVIDUALISTE), beliefs (BOUDDHISTE, CALVINISTE, JANSÉNISTE), ideologies (MARXISTE, CAPITALISTE, PROGRESSISTE), etc., that is, properties that are all apt to characterize humans, directly or indirectly. This is why all -iste adjectives lend themselves to nominal use with human reference, unlike other types of suffixation, such as -eux or -aire, whose derivatives do not all have this capacity (e.g. ARGILEUX, BUDGÉTAIRE).
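The condition on nominal use stated in this section can be sketched as a toy model. Everything in it is an illustrative assumption rather than part of the analysis itself: the class `Adjective`, the Boolean feature `human_property` and the function `human_noun_reading` are hypothetical names, and the feature values simply restate the judgements given in the text (AMBITIEUX and -iste adjectives yes, ARGILEUX and BUDGÉTAIRE no).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Adjective:
    form: str
    # Hypothetical feature (not in the source): True if the property
    # denoted by the adjective can characterize a human being.
    human_property: bool


def human_noun_reading(det: str, adj: Adjective) -> Optional[str]:
    """Gloss [Det A] used as a count NP, or None if the reading is unavailable."""
    if not adj.human_property:
        return None
    return f"{det} {adj.form} = person characterized as '{adj.form}'"


# Feature values restate the judgements reported in the text.
LEXICON = [
    Adjective("ambitieux", True),
    Adjective("idéaliste", True),
    Adjective("argileux", False),
    Adjective("budgétaire", False),
]

for adj in LEXICON:
    print(adj.form, "->", human_noun_reading("un", adj))
```

The sketch deliberately leaves the category of the item unchanged: the nominal reading comes from the slot it fills, which is the point of the override analysis.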


7 Conclusion 


In this article we have examined -iste suffixation, since this lexeme-formation process raises questions that have so far received little attention, concerning the relation between derived lexemes that are identical in form but differ in category, here adjectival and nominal.

We have shown that there are two types of -iste suffixed forms:


(i) nominal forms with no corresponding adjective. These forms are semantically very homogeneous: they denote occupations or social functions. Even if adjectival use is not entirely excluded for some of these nouns, it quite often remains ambiguous (apposition or attributive adjective?).


(ii) adjectival forms with corresponding nouns. These too are highly homogeneous from a semantic point of view: as adjectives they refer to behavioural, ideological, moral or philosophical properties; as nouns they denote the adherents or practitioners of an ideology, a philosophy, a discipline or an activity, as well as people habitually given to a certain behaviour. For these nouns, we have not adopted the analysis of Roché (2011), namely underspecification between the adjectival and nominal categories. We consider instead that these nouns, which necessarily have human reference, derive from the corresponding adjectival form. In addition, we have shown that these nouns, which always retain adjectival properties, are obtained by coercion.

This coercion-based treatment of nouns derived from -iste adjectives fits into a broader analysis of a phenomenon observed throughout the lexicon, namely that any adjective can be used as a noun to designate a human if the property it denotes can characterize humans (Amiot & Tribout, to appear). Moreover, abstract nouns derived from homonymous adjectives, such as le beau, l'utile or l'humanitaire, have been treated by Lauwers (2008, 2014a) as adjectives coerced into nominal uses. Our analysis of human nouns thus dovetails with that of Lauwers and completes the description of nouns homonymous with adjectives in French. Finally, there are also nouns denoting objects that are obtained from homonymous adjectives, such as COMMODE, COLLANT or BLEU. They differ from human nouns on two points, however: (i) they cannot be modified by an adverb; (ii) nominal use to designate artefacts is not as systematic as nominal use to designate humans. These artefact nouns remain to be studied in order to determine how their description fits with the one we have proposed for human nouns, and with the one proposed by Lauwers for abstract nouns.


Chapter 5


French adverbs in -ment:
Lexemes or forms of adjectives?


Georgette Dal 
Univ. Lille, CNRS, UMR 8163 - STL - Savoirs Textes Langage, F-59000 Lille, France 


This article seeks to determine the status of French adverbs in -ment: are they lexemes resulting from the application of a lexeme-formation rule, or wordforms belonging to the paradigm of the adjective? Unlike in other languages such as English or, among the Romance languages, Spanish or Italian, the question has been little debated for French in recent work, with the exception of Dal (2007). Yet a careful examination of the properties of these adverbs and, at the same time, of the rule that produces them clearly favours an inflectional analysis. The conclusion is therefore that adverbs in -ment are contextual variants of adjectives, of which they are wordforms.


1 Introduction 


The sequence -ment found in French adverbs that can be formally and semantically related to an adjective, as in joyeusement / joyeux, prestement / preste or timidement / timide, is generally held to be derivational, so much so that it features prominently as such in university textbooks (for example, Huot 2006, Niklas-Salminen 2015, Gardes-Tamine et al. 2015), not to mention school textbooks and online resources for young learners, where the formation of adverbs in -ment very often serves as the archetypal example of derivation.

The derivational status of the rule of which -ment is the exponent, and hence the lexemic character of the adverbs it forms, is not questioned in research work either, including among morphologists (see for example Corbin 1982, 1987, van Willigen 1983, Bonami & Boyé 2005, Roché 2010, Boyé & Plénat 2015, Detges 2015, Rainer 2016), even within a framework such as Natural Morphology, in which the inflection / derivation opposition is not discrete but scalar (for recent overviews of this approach, cf. Dressler 2005, Luschützky 2015). Yet if one looks closely at the characteristics of French adverbs in -ment, it turns out that the derivational character of the rule of which this sequence is the exponent is by no means self-evident. This is what the present study seeks to bring (back) to light, following on from Dal (2007).

This chapter begins with a review of how some counterparts of French adverbs in -ment have been treated in several Romance languages and in English. This review will provide some reference points for what follows. I will then examine whether French adverbs in -ment meet the expectations for products of a lexeme-formation rule. The examination will show that the answer is negative on every count and that, like their counterparts in other Romance languages and in English, these adverbs can be regarded as contextual variants of adjectives instantiating a cell in the paradigm of the adjectives with which they can be morpho-semantically paired.


2 State of the art


While few, if any, recent studies apart from Dal (2007) question the nature of the rule associating a given adjective with an adverb in -ment in French (its derivational status is generally asserted without discussion), the nature of rules producing adverbs from adjectives has been questioned more extensively in several of the world's languages. We focus here on suffixation in -MENTE¹ in several Romance languages other than French and in -ly in English, and we will see that the question is far from settled, even in the most recent work².


2.1 Adverbs in -MENTE in the Romance languages (other than French)


The status of the sequence -MENTE in adverbs of the Romance languages other than French has been addressed in many studies. Four hypotheses have been put forward: the compound hypothesis (§ 2.1.1), the derivational hypothesis (§ 2.1.2), the phrasal affix hypothesis (§ 2.1.3) and the inflectional hypothesis (§ 2.1.4).


2.1.1 The compound hypothesis


A recurrent hypothesis is that adverbs in -MENTE are compounds and, consequently, that -MENTE is a noun, in line with its Latin etymon mens, mentis ('mind'). The hypothesis has been developed for Spanish (cf., among others, Bello 1847, Hockett 1958, Seco 1972, Zagona 1990, Kovacci 1999). It is also implicit for Catalan and Portuguese in Chircu (2007).

¹ The small-capital notation -MENTE abstracts away here from the realizations -mente or -ment found in the languages concerned.

² Ricca (2015) offers a well-documented survey of the question for other languages of the world.

Besides the etymological argument, the main argument on which proponents of this hypothesis rely is the fact that, in some Romance languages, -MENTE can be elided and factored out when adverbs are coordinated, with -MENTE carried by the first or the last adverb of the series depending on the language. This possibility is attested at least in Spanish, Catalan and Portuguese, as shown by examples (1-3), taken from the Web:


(1) (Sp.) Inspira lenta y profundamente ['slowly and deeply'].


(2) (Cat.) Ràpidament i silenciosa ['quickly and silently'], l'Elena se'ls va acostar.


(3) (Pt.) Quantas pessoas foram, severa e cruelmente ['severely and cruelly'], torturadas por se oporem ao regime?


Some linguists, such as Saporta (1990), have taken this possibility as evidence that adverbs in -MENTE are endocentric compounds headed by the noun -MENTE.
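
The coordination pattern in (1-3) can be made concrete with a small sketch. The Python below is illustrative only: the function name `coordinate_adverbs`, the language codes and the table `MENTE_PLACEMENT` are assumptions introduced here, and the placement of the suffix (last conjunct in Spanish and Portuguese, first conjunct in Catalan) simply encodes what examples (1-3) show. It takes no stand on whether -MENTE is a noun, a suffix or a phrasal affix.

```python
# Hypothetical table: spelling of -MENTE and the conjunct that carries it,
# read off examples (1-3) only.
MENTE_PLACEMENT = {
    "es": ("mente", "last"),   # lenta y profundamente
    "ca": ("ment", "first"),   # ràpidament i silenciosa
    "pt": ("mente", "last"),   # severa e cruelmente
}


def coordinate_adverbs(bases, conjunction, lang):
    """Join adjective forms, spelling out -MENTE on one conjunct only."""
    if len(bases) < 2:
        raise ValueError("coordination needs at least two conjuncts")
    suffix, position = MENTE_PLACEMENT[lang]
    carrier = 0 if position == "first" else len(bases) - 1
    words = [base + suffix if i == carrier else base
             for i, base in enumerate(bases)]
    return ", ".join(words[:-1]) + f" {conjunction} {words[-1]}"


print(coordinate_adverbs(["lenta", "profunda"], "y", "es"))
print(coordinate_adverbs(["ràpida", "silenciosa"], "i", "ca"))
print(coordinate_adverbs(["severa", "cruel"], "e", "pt"))
```

The inputs are the adjective forms as they surface in the examples; the sketch deliberately says nothing about why those forms look like feminines.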

The double stress of adverbs in -MENTE, once on the adjective identifiable in their structure and once on the sequence -MENTE, is another argument sometimes put forward in favour of compounding (Saporta 1990, Detges 2015). This holds particularly for Spanish (cf. 4), where a lexeme resulting from a derivational process normally bears only one stress, whereas compounds allow double stress:


(4) (Sp.) literálménte; rápidaménte; cuidadósaménte


The last argument sometimes invoked, which in fact tells against the derivational hypothesis more than it supports the compound hypothesis, concerns the feminine form of the adjective to which -MENTE is said to attach. If -MENTE were a derivational suffix, it could not apply after an inflectional rule (we return to this point later). The reasoning then runs as follows: if the sequence -MENTE is not a derivational suffix, adverbs in -MENTE can only be compounds, and -MENTE a noun, like its etymon.


2.1.2 The derivational hypothesis


The derivational hypothesis, formulated among others by Karlsson (1981), Bosque (1989), Varela Ortega (1990) and Rainer (1996, 2016) for the attachment of -MENTE to an adjective to form an adverb, is generally a response to the weaknesses of the compound hypothesis. The arguments, recently summarized in Torner (2016), are in essence the following:


(i) the sequence -MENTE found in adverbs of the Romance languages no longer has the full meaning of the Latin noun mens, mentis 'mind', and the adverbs that bear it can be of various semantic types: at least for Italian and Spanish, manner adverbs (lentamente), point-of-view adverbs (economicamente), subject-oriented adverbs (francamente), etc.;


(ii) if lexemes in -MENTE were endocentric compounds with a nominal head, they should be nouns and not adverbs (cf. also Fábregas 2007);


(iii) the adverb in -MENTE inherits the argument structure of the adjective identifiable in its structure (Bosque 1989, Cifuentes Honrubia 2002), as shown by the Spanish examples in (5), taken from Fábregas (2007), something the compound hypothesis cannot predict:


(5) a. paralelo a esto / paralelamente a esto
    b. independiente de ello / independientemente de ello
    c. proporcional al resultado / proporcionalmente al resultado


(iv) the semantic type of the adjective determines the semantic type of the adverb in -MENTE: relational adjectives yield domain or point-of-view adverbs; adjectives expressing an agent's manner of acting yield agent-oriented adverbs; etc. Likewise, the lexical selection restrictions of the adverb correlate with those of the adjective.


The derivational hypothesis does not conflict with the adverbial categorization of adjective-based sequences in -MENTE (cf. argument (ii) above), and it accords better with the inheritance by the derived lexeme of syntactic (cf. iii) and semantic (cf. iv) properties of the base lexeme. While it does not account for the variety of semantic types of adverbs in -MENTE (cf. i), it is at least not incompatible with it.


2.1.3 The phrasal affix hypothesis


Taking up a notion developed by Zwicky (1987), Nevis (1985) and Miller (1992) and mainly applied to clitics, Torner (2005, 2016) sees phrasal affix status as an alternative to the compound and derivational hypotheses.

The phrasal affix hypothesis rests on the hybrid character of the sequence -MENTE in Spanish adverbs. The main argument lies in the fact that this sequence applies to (what looks like) the feminine form of the adjective, in other words to an inflected form built in the syntax. Yet according to Greenberg's (1963) universal 28³, which the split morphology hypothesis developed by Anderson (1977, 1982, 1992) and Perlmutter (1988) takes up in its own way, inflection is held to apply after derivation.

Although Torner (2005: 131), following Rainer (1996: 87), concedes that this choice of a feminine form is more a vestige, given the etymon of -MENTE, than a requirement of the syntax, he regards it as a decisive argument, which moreover explains the possibility, noted above, of eliding the sequence when adverbs are coordinated. Under the phrasal affix hypothesis there is in fact no elision, but rather attachment of -MENTE to an adjective phrase (Torner 2005: 132), in other words to a syntactic sequence (hence the notion of phrasal affix), which is therefore formed after an inflectional marker has been applied to the adjective.


³ "If both derivation and inflection follow the root, or they both precede the root, the derivation is always between the root and the inflection" (Greenberg 1963: 93).


2.1.4 The inflectional hypothesis


The inflectional hypothesis seems to have been explored less than the compound and derivational hypotheses as an account of the status of -MENTE in the Romance languages.

For Spanish, it is nevertheless found in Hjelmslev (1928) and, following him, in Alarcos Llorach (1951: 85), for whom the "forma adverbial del adjetivo en -mente debe considerarse como un 'casus adverbialis', pues su morfema es exigido por el 'verbo' regente". Pottier (1966) likewise considers that in Spanish, adverbs in -mente are nothing other than the form taken by the adjective when governed by a verb and, therefore, that -mente is a case marker there.

For Italian, one may cite Scalise (1990) and Ricca (1998, 2004), even though, at the end of their examination, neither adopts the inflectional hypothesis.

According to Scalise (1990), the main obstacle the hypothesis faces in Italian is the limited productivity of suffixation in -mente, where productive is to be understood as 'able to apply whenever the categorial conditions for application are met'⁴. Whereas inflection is held to be fully productive (in French, for example, any adjective can be inflected for number), derivation is held to be less so. This contrast features prominently in the very large body of work examining the criteria meant to distinguish inflection from derivation (cf., among others, Dressler 1989, Scalise 1988, Haspelmath 1996, Blevins 2001, Kilani-Schoch & Dressler 2005, Stump 2005⁵, ten Hacken 2014, Štekauer 2015). Now, regarding -mente in Italian, Scalise (1990) lists several categories of adjectives that are said to resist it. If, like him, one sets aside possessives, demonstratives, indefinites and numerals on the grounds that their adjectival status is debatable, these are essentially the following (the asterisks are those of Scalise 1990):


(a) adjectives expressing physical properties (calvo 'bald' / *calvamente), including colour adjectives (giallo 'yellow' / *giallamente),


(b) the literal sense of adjectives that have both a literal and a metaphorical sense (aridamente is said to be possible only with the figurative reading of arido, 'devoid of feeling'),


(c) the spatial sense of adjectives that have both a spatial and a temporal reading: thus the adjective lungo 'long (in time or in space)' does have a corresponding adverb, lungamente, but the latter expresses an exclusively temporal property,


(d) a number of derived adjectives: evaluatives (leggerino 'rather light' / *leggerinamente), relational adjectives in -acco (polacco 'Polish' / *polaccamente), in -ale (postale 'postal' / *postalmente), in -ano (isolano 'insular' / *isolanamente), etc., and deverbal adjectives in -bile in their positive form (utilizzabile 'usable' / *utilizzabilamente). For S. Scalise, 45 of the 65 adjective-forming suffixes of Italian thus block subsequent suffixation in -mente, although this is not a categorical structural impossibility, as shown by naturalmente, temporaneamente, barbaricamente, amabilmente, etc., which Scalise (1990) cites⁶.


⁴ This use of the notion of productivity is distinguished here from that of Schultink (1961) (in essence: the possibility for speakers of a language to form, unintentionally, an in principle infinite number of new morphologically complex words by means of a given process). For a recent overview of the notion of productivity, cf. Gaeta & Ricca (2015) and Dal & Namer (2016).

⁵ Stump (2005: 54) prefers the term completeness to productivity.


Adjectives resulting from compounding are likewise said to be unable to yield an adverb in -mente in Italian: *dolceamaramente, *storicocriticamente, etc.

On the basis of what he regards as the limited applicability of suffixation in -mente⁷, Scalise (1990) therefore rejects the inflectional hypothesis and prefers the derivational hypothesis.

As for D. Ricca, his rejection of the inflectional hypothesis as an account of Italian suffixation in -mente is less categorical. After examining the various characteristics of this suffixation, Ricca (1998) concludes that it is a good example of an intermediate case between inflection and derivation, both synchronically and diachronically. In Ricca (2004), he qualifies this position and considers that, within the morphological system of Italian, given the morphological and semantic restrictions to which it is subject and despite its very high productivity (cf. also Gaeta 2008), suffixation in -mente belongs to derivation, even though it is not prototypical derivation (Ricca 2004: 473).


2.2 Adverbs in -ly in English


In English, the status of the sequence -ly found in adverbs such as beautifully or rapidly in (6) has been addressed repeatedly:


(6) a. She sings beautifully. 
b. The birds moved rapidly. 


The discussions concern the derivational or inflectional status of the rule with which the sequence -ly is associated, to the exclusion of any other hypothesis. Unlike what we saw for -MENTE, the compound hypothesis is not explored, despite the nominal etymon of -ly, lic, meaning 'form, appearance, body' in Old English (cf. in particular Jespersen 1954, Ricca 2015).


⁶ Occurrences of the sequences marked as impossible by Scalise (1990) can also be found on the Web. For example polaccamente (lit. 'Polishly'): "(...) e il segretario particolare di Giovanni Paolo, un prete polacco dal nome polaccamente impossibile".

⁷ Much the same impossibilities have been noted for Spanish: cf. Egea (1993), Garcia Page (1991), Kovacci (1999), Fábregas (2007).




2.2.1 The inflectional hypothesis


For proponents of the inflectional approach, defended among others by Hockett (1958: 110), Lyons (1968), Sugioka & Lehr (1983), Miller (1991: 95), Haspelmath (1996: 49-50), Baker (2003: 230-235) and, more recently, Giegerich (2012) and Pittner (2015), the arguments are in essence the following⁸:


– the reputedly very high productivity of suffixation in -ly, acknowledged by all studies devoted to it regardless of the status they assign to it. In English, any adjective can in principle yield an adverb in -ly (cf. in particular Bybee 1985: 84, Scalise & Guevara 2005: 159), apart from a few regularly cited cases: adjectives with a corresponding irregular adverb (e.g. good / well / *goodly) and, above all, adjectives ending in -ly⁹. Even so, the emblematic example sillily (from silly 'foolish, stupid'), regularly cited as impossible, is attested in the Oxford English Dictionary and, among others, in Webster's Online Dictionary, alongside, in particular, burlily, chillily, cleanlily, comelily, deadlily, friendlily and ghastlily. This morphophonological constraint, which Štekauer (2005: 216) attributes to dissimilation prohibiting two consecutive /li/ sequences on either side of a morphological boundary, is therefore weak at best, if it holds at all;


– the fact that the use of an adverb in -ly rather than the adjectival lexeme to which it is related is motivated by the syntax: the adverb appears in non-nominal contexts, in other words in contexts unable to host a modifier whose adjectival status is evident. For example, whereas the verbs sing and move in (6) above require an adverb in -ly, the nouns song and movement in (7) can only co-occur with a genuine adjective:


(7) a. She sings beautifully. / Her song is beautiful. A beautiful song.
    b. The birds moved rapidly. / Their movements are rapid. Rapid movements.


– the separate treatment that adverbs in -ly would require relative to affixless adverbs in English if the former were treated as derivational. According to Giegerich (2012), such a treatment would obscure generalizations that can be made about the class of unaffixed adverbs, which he regards as adjectives (we return to this point in § 3.3.1);


– the mutually exclusive character of suffixation in -ly and suffixation in -er and -est, which mark comparison and high degree. This observation leads Hockett (1958) to conclude that adverbial forms in -ly belong to the same paradigm as adjectival forms in -er and -est, and hence that, like -er and -est, -ly is inflectional (cf. also Giegerich 2012). As for the stumbling block that the categorization of -ly words as adverbs may represent relative to the adjectives to which they are related (inflection is held to preserve the lexical category of the lexeme on which it operates), two explanations compete among proponents of the inflectional view:


    – Haspelmath (1996) hypothesizes that there are inflectional processes that can affect the lexical category of their inputs (he speaks of transpositional inflection), such as present participles in German or adjectives in attributive position in Turkish, and extends precisely this hypothesis to English adverbs in -ly (cf. also Bybee 1985);


    – Sugioka & Lehr (1983) question the very relevance of the category of adverb (cf. also, among others, Fábregas 2014), and consider that what is customarily called an 'adverb in -ly' is nothing other than the form of the adjective's paradigm that is selected in a non-nominal context.


⁸ One may also cite Emonds (1976), Radford (1988), Plag (2003) and Bassac (2004), which are textbooks that have contributed, as such, to disseminating the inflectional view.

⁹ Cf. Bybee (1985: 84-85), Anderson (1992: 195). On the avoidance of adverbs ending in the sequence -lily, cf. Bauer (1983, 1992, 2001). For a detailed examination of adjectives ending in the sequence /ly/, cf. Bauer et al. (2013: chap. 15).


We return at greater length to this question of categorization as adverbs in § 3.3.1, when determining the derivational or inflectional status of the rule of which -ment is the exponent in French.


2.2.2 The derivational hypothesis


For proponents of the derivational approach, who include Zwicky (1995), in response to Sugioka & Lehr (1983), and Payne et al. (2010), taken up by Ricca (2015), the arguments are the following:

– it is not true that adjectives and the corresponding adverbs in -ly appear in mutually exclusive syntactic environments, nouns for adjectives and other environments for adverbs. This is the central argument of Payne et al. (2010), who hold that adverbs in -ly, provided they follow the nouns they modify, can function as noun modifiers, as do globally and internationally in the examples in (8), taken from that study:


(8) a. [The unique role globally of the Australian Health Promoting Schools
       Association], as a non-government organization specifically
       established to promote the concept of the health promoting school, is
       described.

    b. The NHS and [other health organisations internationally] clearly need
       methodologies to support benefit analysis of merging healthcare
       organisations.


– not all adverbs in -ly inherit the argument structure of the adjectives to which they are related, as shown by the contrast proud of his daughter / *proudly of his daughter;


– certain semantic types of adjectives do not give rise to an adverb in -ly. The categories of adjectives cited by Scalise (1990) for Italian recur here: colour adjectives; adjectives expressing a sensory property, unless the adverb has a metaphorical value (e.g. warmly); etc.


2.3 Discussion 


The foregoing review has established at least one point: the status of the morphological rules producing adverbs from adjectives in several Romance and Germanic languages has given rise to extensive, at times heated, discussion, and the question remains unresolved. In this light, it is surprising that, for French, few studies have examined the status of the rule to which the adverbial sequence -ment belongs.

It is also striking that, in the work discussed in this review, the derivational hypothesis is never argued for positively: either it is a response to the weaknesses of the compound hypothesis (cf. § 2.1.2), or it tempers the generalizations of the inflectional hypothesis (cf. § 2.1.4 and § 2.2), but it rarely, if ever, puts forward irrefutable arguments showing that adverbs in -MENTE or -ly result from the application of a lexeme-formation rule and, consequently, that these adverbs are lexemes in their own right.


3 What status for French adverbs in -ment?


Given that the compound hypothesis is ruled out for French (the argument from elision or factoring out of -ment across several adverbs, considered decisive by proponents of this hypothesis for Spanish, does not hold for Modern French¹⁰), the alternative is the same as for -ly in English: inflection or derivation? Unless, that is, French adverbs in -ment belong to one of those 'grey areas' (Bybee 1985) that are undecidable between inflection and derivation.

To try to provide some answers to this question, I propose in what follows to go through the expectations for a lexeme-formation rule. Before that, I will discuss the form of the adjective stem to which -ment attaches, so as to set this question aside from the discussion.


¹⁰ Meyer-Lübke (1894: 638) notes this possibility in Old French with the example "Ainzi fu la guere maintenue Si cruel e si longuement", also cited in Karlsson (1981: 60).


3.1 Form of the adjective stem


The French sequence -ment is held to apply to the feminine form of the adjective to which the adverb is related (cf. among others Guimier 1996, Molinier & Levrier 2000: 28-29), in other words to an inflected form. As seen above, this same observation, made for Spanish and Italian among others, has been credited to the inflectional hypothesis and to the phrasal affix hypothesis, insofar as a derivational rule is assumed not to be able to apply after an inflectional operation.

However, Aronoff's notion of the morphome (Aronoff 1994), according to which certain morphological units express no morphosyntactic or semantic property (they are pure forms or, in the words of Bonami & Boyé (2005: 82), "purs objets morphologiques", that is, pure morphological objects), offers an elegant explanation that is neutral as to the status assigned to the rule with which the sequence -ment is associated.

Using the notion of morphome, Bonami & Boyé (2005) hypothesize that French adjectives have a stem space made up of two stems, identical or distinct, which express no morphosyntactic property and which serve to build the five forms of their paradigm: the four traditional forms involving the categories of gender and number, plus a masculine singular liaison form in prenominal position. Table 1, taken from Bonami & Boyé (2005), gives the stems of some French adjectives:


Table 1: Stem space of some French adjectives


Lexeme     Stem 1     Stem 2
LIVIDE     /livid/    /livid/
SEC        /sɛk/      /sɛʃ/
VIF        /vif/      /viv/
VIEUX      /vjø/      /vjɛj/
NOUVEAU    /nuvo/     /nuvɛl/


In inflection, stem 1 is used for the masculine outside liaison (arbre sec; regard vif; vieux fauteuil; nouveau manteau); stem 2 is used for the feminine (branche sèche; riposte vive; vieille ferme; nouvelle tenue). As for the masculine singular liaison form in prenominal position, stem 1 (sec entretien: [sɛkɑ̃tʁətjɛ̃]) or stem 2 (vieil avion: [vjɛjavjɔ̃]) is used, depending on the adjective.
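
The mapping from paradigm cells to stems just described can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the formalization used by Bonami & Boyé (2005): the class name `AdjectiveStems`, the cell labels and the per-lexeme field `liaison_stem` are hypothetical, the stems of SEC and VIEUX are given in IPA, and the sketch returns only the stem selected for a cell, leaving aside any exponent added to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdjectiveStems:
    """Two-stem space of a French adjective (hypothetical encoding)."""
    lexeme: str
    stem1: str
    stem2: str
    liaison_stem: int  # 1 or 2; listed per lexeme here, as an assumption

    def stem_for(self, cell: str) -> str:
        """Return the stem on which the given paradigm cell is built."""
        if cell in ("M.SG", "M.PL"):
            return self.stem1
        if cell in ("F.SG", "F.PL"):
            return self.stem2
        if cell == "M.SG.LIAISON":
            return self.stem1 if self.liaison_stem == 1 else self.stem2
        raise ValueError(f"unknown cell: {cell}")


# Liaison choices follow the two examples in the text:
# sec entretien (stem 1) and vieil avion (stem 2).
SEC = AdjectiveStems("SEC", "sɛk", "sɛʃ", liaison_stem=1)
VIEUX = AdjectiveStems("VIEUX", "vjø", "vjɛj", liaison_stem=2)

for adj in (SEC, VIEUX):
    for cell in ("M.SG", "F.SG", "M.SG.LIAISON"):
        print(adj.lexeme, cell, adj.stem_for(cell))
```

Nothing in the stems themselves says "masculine" or "feminine"; the cell-to-stem mapping does that work, which is the sense in which the stems are morphomic.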

To account for the form of the stem in adverbs in -ment, the solution sketched in Dal (2007) and developed at length in Boyé & Plénat (2015) is to add a third stem to the stem space of French adjectives. Depending on the case, this third stem may be (i) homophonous with stem 2, (ii) homophonous with stem 1, or (iii) different from stems 1 and 2. In most cases, adverbs in -ment involve a stem homophonous with stem 2 (for example, /sɛʃmɑ̃/ relative to /sɛʃ/), in other words the stem that also serves, in the great majority of cases, to form the feminine of adjectives. This observation explains the recurrent claim that, in adverbs, the sequence -ment applies to a form inflected for feminine singular, and also the hesitations of writers when the masculine and feminine forms are homophonous without being homographs: for example /ʒolimɑ̃/, which writers spell joliment (9 million hits on the Web with the Google search engine at the end of June 2017) or joliement (300,000 hits). Other choices of stem are nevertheless possible, as Boyé & Plénat (2015) show:


– when stem 1 of the adjective ends in /ɑ̃/, -ment most often applies to a homophone of this stem, modulo denasalization of the final vowel (cf. Pagliano 2003): cf. méchamment, preferred to méchantement (which nonetheless has about a hundred hits on the Web at the end of June 2017), and violemment, preferred in Modern French to violentement, except when this final vowel is preceded by a nasal or a labial, in which case the stem selected is preferentially homophonous with stem 2 (cf. charmantement, preferred to charmamment¹¹);


– in other cases, stem 3 differs from stems 1 and 2, and is either characterized by the appearance of an /e/ concatenated to the stem 2 that serves to form the feminine (obtusément)¹², or is simply unpredictable from a synchronic point of view (brièvement).


Table 2, adapted from Boyé & Plénat (2015), summarizes these results.


Table 2: Proposed addition of a stem 3 to the stem space of some French adjectives.


Lexeme      Stem 1      Stem 2       Stem 3
JOLI        /ʒoli/      /ʒoli/       /ʒoli/
SEC         /sɛk/       /sɛʃ/        /sɛʃ/
CHARMANT    /ʃaʁmɑ̃/     /ʃaʁmɑ̃t/     /ʃaʁmɑ̃t/
MÉCHANT     /meʃɑ̃/      /meʃɑ̃t/      /meʃam/
OBTUS       /ɔpty/      /ɔptyz/      /ɔptyze/
BREF        /bʁɛf/      /bʁɛv/       /bʁijɛv/
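
The preferences reported above for choosing stem 3 can be approximated by a short procedure. The sketch below is an assumption-laden simplification: the function name `predict_stem3` and the segment set `BLOCKING` are introduced here for illustration, the tendencies ("most often", "preferentially") are hard-coded as if they were categorical, and the /e/-augmented and unpredictable stems (obtusément, brièvement) are deliberately not covered.

```python
NASAL_A = "ɑ\u0303"        # /ɑ̃/ written as base vowel + combining tilde
BLOCKING = set("pbmfvnɲ")  # labials and nasals (assumed segment inventory)


def predict_stem3(stem1: str, stem2: str) -> str:
    """Guess the stem used before -ment from stems 1 and 2."""
    if stem1.endswith(NASAL_A):
        before = stem1[: -len(NASAL_A)]
        if before and before[-1] in BLOCKING:
            return stem2          # charmant: stem 2, charmantement
        return before + "am"      # méchant: denasalized stem 1, méchamment
    return stem2                  # default: homophone of stem 2


ROWS = {
    "SEC": ("sɛk", "sɛʃ"),
    "MÉCHANT": ("meʃ" + NASAL_A, "meʃ" + NASAL_A + "t"),
    "CHARMANT": ("ʃaʁm" + NASAL_A, "ʃaʁm" + NASAL_A + "t"),
}

for lexeme, (s1, s2) in ROWS.items():
    # The adverb is the predicted stem followed by the exponent /mɑ̃/.
    print(lexeme, predict_stem3(s1, s2) + "m" + NASAL_A)
```

For the three rows shown, the predicted stems agree with the stem 3 column of Table 2. The procedure is silent on whether the rule that adds -ment is inflectional or derivational.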


¹¹ Excluding metalinguistic mentions and duplicate pages, charmantement (sometimes in the form charmentement) has about 160 hits on the Web as of 1 October 2016 (e.g. "La pluie continuait de tomber. J'étais charmantement abritée"), against about ten for charmamment / charmament (e.g. "Évidemment un peu vieux jeu, charmamment démodé").

12 The emergence of this /e/ is not random: it recurs after a nasal consonant (cochonnément, communément, conformément, opportunément, uniformément...) or after a fricative, most often a sibilant, whether voiced (concisément, confusément, précisément...) or voiceless (densément, expressément...), and more rarely after a liquid (aveuglément) or a trill (obscurément). In the first case, the emergence of this /e/ might serve to satisfy the dissimilatory constraint mentioned earlier. The option taken here, as in Boyé & Plénat (2015), is that /e/ is part of the stem; I refer the reader to that work for the arguments.
The solution of adding a third stem to the adjective's paradigm in order to form adverbs in -ment, summarized here from Boyé & Plénat (2015), is orthogonal to the question of the status of the rule with which the exponent -ment is associated, insofar as both inflectional and derivational rules can select a given stem of the stem space, exclusively or preferentially (cf. Bonami et al. 2009). It therefore removes from the discussion the form of the stem to which -ment attaches, and avoids drawing any argument from this form, wrongly identified as a feminine. More precisely, if the stem most often has the shape of a feminine, this is because the formation of adverbs in -ment, whatever its status, and the formation of the feminine of the adjective in French both operate preferentially on stem 2, or on a homophone of that stem.

It may also be noted that, without explicitly invoking the notion of morphome, ten Hacken (2014: 19) likewise considers that, in order to reconcile the French data with Greenberg's Universal 28, one solution is to regard lente in lentement as a variant of the adjective's stem. For his part, Ricca (2015: 1392) appeals to the notion of morphome to account for the vowel /a/ that closes the stem of some adverbs in -MENTE in Italian, Portuguese and Spanish.


3.2 What is expected of a Lexeme Construction Rule


A Lexeme Construction Rule (Règle de Construction de Lexèmes, henceforth RCL) can be schematically defined as a set of regularities observable between two series of lexemes, of which one series, the outputs, has a higher degree of complexity than the other, the inputs.

According to Fradin (2003), an RCL belonging to the process of derivation is represented by the following schema (Table 3):


Table 3: Schema representing an RCL belonging to the process of derivation, according to Fradin (2003).


Inputs             Outputs
Phonology 1        Phonology 2
Syntactics 1   =   Syntactics 2
Semantics 1        Semantics 2

This schema amounts to saying that an RCL operates on three levels: the phonological level, the syntactic level and the semantic level.

In general, constraints of various types can operate on the inputs and on the outputs. Leaving aside phonological constraints, which operate at the level of a particular lexeme (or set of lexemes) rather than at the level of the rule as such, these are essentially:


— semantic constraints: each constructional process applies to a semantic type of base (for example, bases expressing properties, or referring to events, natural parts, etc.), or requires that the bases it selects themselves belong (or not belong) to a certain semantic type. Likewise, the meaning of the outputs is a function of the meaning of the inputs, this function being characterized by a constant, namely the one that records the semantic contribution of the RCL, and by a variable, represented by the meaning of the input;


— syntactic constraints: an RCL applies to a certain categorial type of base and forms a certain categorial type of derivative. These constraints can be seen as a consequence of the semantic constraints (cf., in particular, Dal 2004).


Other constraints may come into play (historical and pragmatic constraints, in particular); we leave them aside here.
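
The three planes and the input and output constraints just described can be pictured with a small data structure. This is a minimal sketch under stated assumptions, not Fradin's (2003) formalism: the classes `Lexeme` and `RCL`, the toy rule `agent_eur` and the simplified orthographic stem are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lexeme:
    phonology: str  # simplified here to an orthographic stem (assumption)
    category: str   # syntactic category, e.g. "V", "N", "A"
    semantics: str  # informal description of the meaning


@dataclass(frozen=True)
class RCL:
    """A rule relating input lexemes to output lexemes on three planes."""
    input_category: str         # syntactic constraint on inputs
    output_category: str        # syntactic constraint on outputs
    phon: Callable[[str], str]  # phonological operation
    sem: Callable[[str], str]   # constant contribution applied to the input meaning

    def apply(self, base: Lexeme) -> Lexeme:
        # Reject bases that do not satisfy the categorial input constraint.
        if base.category != self.input_category:
            raise ValueError(f"rule does not accept category {base.category}")
        return Lexeme(
            self.phon(base.phonology),
            self.output_category,
            self.sem(base.semantics),  # output meaning = constant(variable)
        )


# Hypothetical toy rule: agent nouns in -eur built on verbs.
agent_eur = RCL(
    "V",
    "N",
    lambda stem: stem + "eur",
    lambda meaning: f"one who performs: {meaning}",
)

marcher = Lexeme("march", "V", "walk")

if __name__ == "__main__":
    print(agent_eur.apply(marcher))
```

The `sem` function plays the role of the constant, and the meaning of the base that of the variable; the question pursued below is whether -ment has anything comparable to offer on the syntactic and semantic planes.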

As for the rule that forms adverbs in -ment from adjectives in French, once the question of the most often feminine shape of the stem has been resolved, thanks to the notion of morphome and the addition of a third stem to the adjective's stem space, what remains to be determined is whether the input and output constraints attached to it meet the requirements of an RCL.


3.3 Examination
3.3.1 Syntactic constraints
3.3.1.1 Syntactic input constraints


The rule of which -ment is the exponent takes adjectives as input in the vast majority of cases (call this property P1).

To give an order of magnitude, the corpus assembled by Pagliano (2003) contains 2,746 adverbs, of which 2,725 are formed from adjectives or participles, that is, more than 99%.
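
The proportion can be checked with a few lines of arithmetic. The two counts are those cited above from Pagliano (2003); the variable and function names are illustrative only.

```python
# Counts cited in the text for Pagliano's (2003) corpus.
TOTAL_ADVERBS = 2746
FROM_ADJ_OR_PARTICIPLE = 2725


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Adverbs whose base is neither an adjective nor a participle.
residue = TOTAL_ADVERBS - FROM_ADJ_OR_PARTICIPLE

if __name__ == "__main__":
    print(f"adjective/participle bases: {share(FROM_ADJ_OR_PARTICIPLE, TOTAL_ADVERBS):.2f}%")
    print(f"other bases: {residue} adverbs ({share(residue, TOTAL_ADVERBS):.2f}%)")
```

The residue thus amounts to 21 adverbs, slightly under 1% of the corpus.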

The remaining 1% consists of adverbs occurring:


(i) in formulaic sequences of the type X-ment vôtre or X-ment parlant, as in the examples in (9), found on the Web¹³:


(9) a. Internet ment vôtre; rock’n’roll’ment vôtre; jazz ment vôtre;
meuh..ment vôtre


b. Le script est crade HTML ment parlant. 


c. Il n'est pas bizarre, marketing-ment parlant, de faire ça. 


(ii) in playful coinages, such as ordinateurement or mousquetairement in (10), also taken from the Web:


(10) a. Protection contre les maladies ordinateurement transmissibles. 


b. Blafard de teint, ses cheveux aplatis, sa barbe pointue et sa moustache 
« mousquetairement » retroussée rutilent comme l'or. 


13 On the morphology of sequences in X-ment parlant and X-ment vôtre, cf. Boyé & Plénat (2015) and, for the latter, Mora (2007).
(iii) in adverbs derived from personal names, such as the literary examples baudelairement or lamartinement in (11) (cf. Amiot & Flaux 2005):


(11) a. Je suis dans un jour où je vois tout idéalement et douloureusement, et 
enfin, s’il m’est possible de m’exprimer ainsi, lamartinement 
(Sainte-Beuve, Portr. Littér.) 

b. Une manière de fatalité (...) qu'à présent il nomme moins 
baudelairement le train-train de l'existence (Verlaine, Œuvres 
posthumes) 


One hypothesis is that these sequences are formed by analogy (cf. Dal 2003) with sequences involving an adverb built on an adjective¹⁴:


— it is quite likely that playful polite formulas such as those in (9a) echo formulas whose base is unquestionably adjectival, such as cordialement votre or amicalement votre;


— in sequences in X-ment parlant, the adverbs are mainly formed from relational adjectives (cf., found on the Web, grammaticalement parlant, philosophiquement parlant, financièrement parlant, culturellement parlant). While it is harder to find a leading model for the sequences in (9b) than for those in (9a), one may nonetheless consider that these adjective-based sequences served as their model;


— in a sequence like the one in (10a), one cannot fail to notice the echoic play on the quasi-fixed sequence maladies sexuellement transmissibles (the same echoic play on a quasi-fixed sequence is likewise observed in, for example, « Paysage ordinateurement modifié »¹⁵, also found on the Web);


— finally, both in the example in (10b) and in the name-based adverbs in (11), the suffix-like, adjectival appearance of the ending of the base noun (mousquetaire, Baudelaire, Lamartine) is a factor favouring the emergence of the adverb (Amiot & Flaux make a similar remark for name-based adverbs)¹⁶. When the personal name has no suffix-like ending, the tendency is to go through a relational adjective (as in this example found on the Web: « (...) en mettant moliéresquement tous les rieurs de son côté »).


14 Within Construction Grammar, another explanation, not incompatible with the one proposed here, would be that -ment in (9)/(11) coerces an adjectival reading of the item to which it is concatenated (cf. Audring & Booij 2016).

15 The model here is of course génétiquement modifié.

16 In some languages, the final sequence of forms paraphrasable as « in the manner of X », where X is a noun, is treated as a marker of the essive case, and hence as inflectional (for example, Hungarian -ként in turistaként « in the manner of a tourist »; cf. Ricca 2015: 1399).
In short, this 1% would be the result of lexical pressure, and would be formed by analogy with adverbs (or sequences containing an adverb) whose base is genuinely adjectival.

With respect to the status of the rule with which -ment is associated, the input constraint P1 (-ment applies overwhelmingly to adjectives) is not decisive: if adverbs in -ment are produced by a derivational rule, that rule would take adjectives as input; if they are produced by a lexeme realization rule (that is, by inflection), they are expected to be wordforms of a single category, which would in this case be that of adjectives.


3.3.1.2 Syntactic output constraints


Let us therefore grant that the bases of words in -ment are adjectives. The fact remains that these words are categorized as adverbs. Call this property P2. Now, one of the properties regularly invoked to distinguish inflection from derivation is that only derivational rules can form lexemes belonging to a category different from that of the lexemes they take as input: this would seem to be the decisive argument in favour of the derivational character of the rule that has -ment as its exponent.

However, we saw above that, for Haspelmath (1996), who here follows the proposal outlined in Bybee (1985), inflection can affect the category of its outputs, and that, according to him, -ly suffixation in English is precisely one such transpositional inflectional rule (Scalise 1988 also considers the case of inflectional rules whose outputs do not belong to the category of their inputs). The formation of French adverbs in -ment could be open to the same explanation.

Moreover, even if this possibility is rejected, we have already stressed the difficulty of satisfactorily delimiting the category of the adverb, which is, at the very least, highly heterogeneous (Ricca 2015), to the point that some linguists call its very existence into question, sometimes peremptorily. This is the case of Aronoff (1994: 10), who states: « I assume without argument that adverbs are adjectives ».

Let us review the main arguments that have been, or could be, put forward for calling the category of the adverb into question, in whole or in part.

For Giegerich (2012), the arguments are first of all morphological. In his view, what are conventionally called « adverbs » in English display no morphological property that would distinguish this category from that of adjectives: he concludes that adverbs are forms of adjectives. This « single-category claim », which holds both for adverbs in -ly and for adverbs without affixal marking (he treats the latter as uninflected adjectives), would explain why, unlike the categories of noun, adjective and verb, the category of the adverb cannot serve as input to any derivational rule, given the order of application derivation, then inflection¹⁷. In parallel, the hypothesis of a single category uniting adjectives and adverbs would explain why, in French, if one sets aside marginal cases such as baudelairement seen above, no adverb derives from a noun or a verb, whereas for the genuine major lexical categories, namely nouns, adjectives and verbs, every pairwise combination is possible.


17 The apparent counterexamples he takes from Payne et al. (2010: 63), such as soonish, soonness, seldomness, unseldom, involve affixations that, precisely, typically apply to adjectives.

Furthermore, whereas nouns, adjectives and verbs can serve as inputs to more than one derivational rule, if -ly suffixation is assigned derivational status, the adverb would be atypical in that, apart from adjective-to-adverb conversion (we return to this point below), it would involve only this one suffixation.

The situation carries over strictly to French: the category of the adverb does not serve as input to the constructional system of French, and, on the output side, a single marker, -ment, applied to the category of the adjective alone, would be possible, in addition to conversion.

As for English, treating the adverb as a special case of the adjective and -ment as an inflectional marker provides an explanation for the atypical position of adverbs in the derivational system of French: the adverb cannot serve as input to a derivational rule because it is a wordform and not a lexeme, and it is an output of the adjectival category only because it occupies a cell in the paradigm of that category.

For Giegerich (2012), from the point of view of inflection, the English adverb likewise displays no properties that would distinguish it from the adjective. Morphological variation in degree is possible for the adverb, but it affects only adverbs lacking -ly, and the inflectional markers used are precisely those also found with the adjective (big: bigger, biggest; soon: sooner, soonest). As already seen, the fact that adverbs in -ly do not accept degree marking by means of inflectional markers is in turn explained under the inflectional hypothesis defended by Giegerich, since, as wordforms, they occupy a cell in the paradigm of the adjective: because the exponents -er, -est and -ly instantiate wordforms of the same paradigm, they are mutually exclusive.

As for French, the situation is comparable, at least in part, insofar as the adverb is held to be invariable there. Hummel (2013, 2014) indeed calls into question the invariability of « short adverbs », along with their membership in the category of the adverb. For him, as for Abeillé & Godard (2004), gras in manger gras or direct in Pierre et Marie vont direct au café are not adverbs but « unmarked adjectives » or « adjectives in adverbial function ». His argumentation both draws on diachronic arguments and exploits present-day corpus data, from a variationist perspective. Indeed, in languages with adjectival inflection such as French, a tendency observed in non-standard uses of the contemporary language revives the practice, current until the 17th century, of making short adverbs agree. This agreement is found with the subject or with the internal object, as can be seen in (12a), found on the Web, and (12b), taken from Hummel & Gazdik (2014):
(12) a. Ils jouent forts, et souvent faux, ponctuent les chansons d'exclamations en
espagnol, sont d'une bonne humeur resplendissante et communicative.


b. Je suis sur le point d'arrêter nette ma conso de cannabis.


M. Hummel's hypothesis is that this is a strategy for maintaining thematic cohesion within the predication with one of the verb's arguments, internal or external. It can be observed, however, that this agreement is favoured when the short adverb is homophonous with the masculine form of the adjective. Thus, while examples such as those in (13) can be found on the Web:


(13) a. Ce que la nouvelle recherche suggère, c'est que les bénéfices de la course à
pieds pourraient s'arrêter nets plus tard dans la vie.


b. En juin 2011, un généalogiste amateur originaire de l'Aude et résidant depuis
quelques années dans l'Hérault a vu ses recherches piétiner pour s'arrêter
nettes.


queries such as « joue(nt) forte(s) » or « joue(nt) fausse(s) » return far fewer useful results¹⁸.

Be that as it may, the short adverb in French is not distinguished by any inflectional marker exclusive to it: either, from a normative perspective on the language, it is invariable; or, from a perspective more closely in touch with usage, it uses the gender and number inflectional markers of the adjective.

From the point of view of syntax, when degree is expressed syntactically, adjectives and adverbs again share the same markers. What holds for English (both can replace X in, for example, the comparative « more X than », and they admit the same adverbial modifiers: for example, very expensive / very quickly; too big / too slowly) also holds for French. In the attested examples below, the markers très, plutôt, un peu, extrêmement bear on adjectives (14) as well as on adverbs, with or without -ment (15):


(14) a. Il faut généralement agir de façon très stupide pour se retrouver exilé ici.


b. Même s’il était plutôt maigre, plutôt petit et ma foi un peu ridicule, je
pouvais imaginer que (...)


c. Pourquoi mes muscles sont extrêmement douloureux après l'exercice?


(15) a. Nous nous levâmes très tôt, nous fûmes très rapidement habillées.


b. L'ensemble contrastait plutôt désagréablement avec le reste de la demeure.


c. On s'est engagé un peu vite, sans évaluation suffisante des impacts sur la
santé.


d. J'ai été affecté extrêmement douloureusement par tout cela.


18 For example, in July 2017, « jouent fausses » returns about thirty useful results, against about 450 for « arrêtent nettes ».
In conclusion, it appears that P2 is no more irrefutably decisive than P1 as regards the derivational status of the rule to which the exponent -ment belongs: admittedly, sequences in -ment are adverbs, but we have just seen that the very relevance of the adverb as a category distinct from the adjective can be called into question in many respects, and that, if denying its existence is deemed excessive, the transpositional hypothesis, which holds that inflection can produce sequences that do not belong to the category of what it applies to, weakens P2 as an argument.

Let us now examine whether the semantic constraints are any more decisive.


3.3.2 Semantic constraints


3.3.2.1 Semantic input constraints


The rule that forms adverbs in -ment in French can apply to a variety of semantic types of adjectives:


— qualifying adjectives expressing a property: étrange / étrangement; sale / salement;


— so-called relational adjectives, which relate the referent of the noun on which they are built to the referent of their head noun, as shown in (16) by the adverbs found on the Web that can be related to an adjective in -al, -aire, -el, -esque, -ien, -ique and -if¹⁹:


(16) a. La France n'est-elle pas déjà présidentiellement rayonnante ?

b. Il n'y a pas de frontières, du moins pas de frontières définies
géographiquement.

c. Si j'avais su que commander à La Redoute impliquait de se faire spammer à
ce point, électroniquement et postalement, je dormirais encore sur mon
matelas.

d. Les 10 Chefs qui ont marqué mondialement l'Année gastronomique 2014.

e. En effet, c'est un mandarin qui a vécu insulairement (un peu comme le
français de Québec par rapport à la France).

f. (...) en mettant moliéresquement tous les rieurs de son côté.

g. (...) Ou si, rabelaisiennement nourri d'un savoir immense, (...)

h. Un nouveau fléau guetterait les jeunes : les maladies transmises
auditivement.


As regards qualifying adjectives, it has nonetheless been pointed out, notably for Italian (cf. § 2.1.4) and for English (cf. § 2.2.2), that certain semantic types of adjectives resist the addition of an adverbializing exponent. The observation has been made in particular for colour adjectives and, more generally, for adjectives expressing physical or sensory properties.


19 On the productivity of adverbs in -ment, cf. Molinier (1992).

First, restricting ourselves here to colour terms, it should be noted that this is not a structural impossibility, as shown by the examples found on the Web in (17)²⁰, in which, unlike lexicalized adverbs such as vertement, blanchement or noirement, attested in the Trésor de la Langue française, the sequences in -ment do indeed carry the colour value of their base adjective:


(17) a. Bientôt la machine [la guillotine] aura sans doute déclenché son couperet : la 
vie d’une vieillarde et de deux gamins se répandra rougement. 


b. Les puces de Cugnat avaient dû aller chercher ailleurs un abri et le
charbonnier ne montrerait plus jamais le bout violettement épaté de son nez.


c. Tout jeune, il avait trouvé sa voie : vagabonder sur les fortifications dont les 
talus, jaunement verdis de gazon brûlé par le soleil, viennent mourir près du 
viaduc. 


Second, rather than considering, as Scalise (1990) or Ricca (2015) do for Italian, that the obstacle stems from an incompatibility between the meaning of the adjective and the semantic constraints that the rule of which -ment is the exponent imposes on its inputs, I reiterate the hypothesis made in Dal (2007): the rarity of adverbs in -ment with a colour value and, more generally, related to an adjective expressing a physical or sensory property, is due to the fact that, if one accepts that adverbs in -ment characteristically emerge in non-nominal contexts, then, insofar as what a sentence, a verb, an adjective or an adverb refers to has no spatial extension, physical or sensory properties can hardly be associated with it. In short, I agree with Fábregas (2007), who considers that, since colour or shape adjectives are strongly associated with physical entities (Quine 1960), it is to be expected that the corresponding adverbs in -ment, which are thereby also bound to express colour or physical properties, find few non-nominal contexts in which to emerge. The constraint therefore does not stem from morphology as such, but is purely semantic. It hardly differs from the impossibility of using a colour adjective with a noun that does not refer to a physical entity while preserving the adjective's original colour value: the fact that a deliberation can hardly be called violette, or a feat marron, does not mean that violet and marron are not adjectives.

The rule to which -ment belongs thus does not seem to impose semantic constraints on the lexemes it takes as input, since the (entirely relative) impossibilities noted for certain semantic types of adjectives can be explained without holding morphology responsible for them.


20 On adverbs with a colour value, cf. Mora Millan (2005).


3.3.2.2 Semantic output constraints


On the output side, it does not seem possible either to define a semantic function common to all adverbs in -ment. Indeed, as Plag (2003: 196) notes for English and Fábregas (2007: 6) for Spanish, the rule with which the sequence -ment is associated does not encode any particular lexical meaning, and the adverb keeps intact the meaning of the adjective to which it corresponds. More precisely, adjectives expressing qualities correspond to adverbs traditionally classed as manner adverbs (18); adjectives with relational meaning correspond to point-of-view or domain adverbs (cf. (19), which repeats (16b)):


(18) Il déploie joyeusement sur la toile ses émotions et ses visions avec une belle 
énergie. 


(19) Il n'y a pas de frontières, du moins pas de frontières définies géographiquement. 


The case of so-called sentence adverbs may seem to contradict this constant.

Molinier (1990) defines sentence adverbs, for which he proposes a typology²¹, as combining the following two properties: (i) they can occur at the head of a negative sentence; (ii) they cannot be clefted in c'est ... que. Thus, in (20), sincèrement and étrangement are sentence adverbs:


(20) a. Sincèrement, je ne pensais pas qu'un groupe pareil s’intéresserait un jour à
moi.


b. Étrangement, le chasseur ne semblait pas du tout gêné par l'odeur.


Some sentence adverbs may be identical in form to a manner adverb. This is the case of the adverbs in (20), as shown by the examples found on the Web in (21):


(21) a. Si tu t'estimes sincèrement dans ton bon droit, (...) 


b. À l'accueil de l'hôtel, la réceptionniste le regarde étrangement.


Others, such as certainement, seem usable only as sentence adverbs, even if, for Molinier (1990), they may have been used as manner adverbs until the 19th century.

The difficulty these adverbs raise for the claim that the adverb keeps intact the meaning of the adjective to which it corresponds and, in particular, that a qualifying adjective is echoed by a manner adverb, is that this claim predicts neither the existence of sentence adverbs nor the possibility of adverbs with a double use such as those in (20) and (21). One way of resolving this difficulty is to consider that, whatever its type, the operation of adding the sequence -ment to an adjective is semantically transparent, but that another operation, independent of the first, makes it possible to use adverbs in -ment as sentence adverbs. For Lamiroy & Charolles (2004), this second operation is a matter of pragmaticalization, which they define as the passage from the grammatical component to the pragmatic or discursive component of language.


21 He draws a first dichotomy between conjunctive adverbs, which require a left context (subséquemment, semblablement...), and disjunctive adverbs, which do not impose this condition. The latter are in turn divided into style disjuncts (honnêtement, franchement), attitude disjuncts (themselves classified into habit, evaluative and modal disjuncts) and subject-oriented attitude disjuncts.

Be that as it may, while the inflectional hypothesis offers no better explanation of this phenomenon, the derivational hypothesis stumbles on it just as much.

In short, adding the sequence -ment to an adjective does not come with an identifiable semantic function of the kind that would be assigned to an RCL.


3.4 The formation of adverbs in -ment in French: an inflectional rule


3.4.1 Adverbs in -ment: forms of adjectives in non-nominal contexts


At the end of the preceding examination, it appears that the morphological rule forming adverbs in -ment in French does not irrefutably possess any of the properties expected of a lexeme construction rule, from either the syntactic or the semantic point of view: the very existence of the category of the adverb can be called into question and, short of denying the relevance of this category, we could at the very least be dealing here with a case of inflectional transposition; from the point of view of the system, all types of adjectives seem able to be associated with an adverb in -ment; semantically, the addition of -ment preserves the meaning of the adjective, and sentence adverbs in -ment can be regarded as specific uses of manner adverbs.

Conversely, once the objections on which it seems to founder have been lifted, the formation of adverbs in -ment successfully passes all the criteria for distinguishing inflection from derivation that can be found in, among others, Bauer (1997), Dressler (2005), Stump (2005) or Štekauer (2005): among these criteria, we retain here the fact that an adverb in -ment can correspond to any adjective, without the application of this sequence being accompanied by an identifiable constant semantic operation.

The conclusion to be drawn is therefore that the formation of adverbs in -ment belongs to inflection and, consequently, that these adverbs are the form that adjectives can take in non-nominal contexts. In other words, this is a particular case of contextual inflection, to use Booij's terminology (cf., among others, 1994, 1996 and 2000). In a different theoretical framework, this result converges with the older ones of Kuryłowicz (1936: 83), who sees in -ment a « morphème syntaxique », hence an inflectional marker, and of Moignet (1963), from the perspective of psychomechanics.

In support of this result, one can adduce the examples in (22), found on the Web and/or partially taken from Dal (2007), whether the adverb is internal to the verbal domain or functions as a modifier of an adjective or an adverb. Thus, the choice of soigneux vs soigneusement in (a/a’) depends on the category of the lexeme these forms bear on, according to whether it is a noun (a) or a verb (a’). The same remark holds in (b/b’) with réponse rapide vs répondre rapidement, in (c/c’) with applaudissements bruyants vs applaudir bruyamment, and in (d/d’) with marcheur lent vs marcher lentement. In (e/e’), it is the adjectival context that triggers the emergence of the adverb rapidement in (e), whereas in (f’) the trigger is the adverb vite (more probably an adjective if one follows Giegerich 2012²²). In these various examples, the adverbs in -ment satisfy the commonly accepted definition of inflection given by Anderson (1992: 83), according to which « Inflection thus seems to be just the morphology that is accessible to and/or manipulated by rules of syntax »:


(22) a. « Sac à dos » à roulettes d'une grande capacité est semi-rigide afin de
permettre un rangement soigneux et une protection optimum.

a’. L'album photo 26x30 est l'outil parfait pour ranger soigneusement vos
précieux clichés.

b. Vous recevez une réponse anonyme et gratuite à vos questions.

b’. Plus de 16 000 collégiens et lycéens de 12 à 18 ans ont répondu
anonymement à un questionnaire détaillé.

c. Alors tout le bois résonne des applaudissements bruyants des spectateurs et
des cris ardents des supporters.

c’. Il savait qu'ils ne pouvaient plus remonter, lui répondit Harry, en criant lui
aussi pour couvrir le vacarme, mais sans cesser d'applaudir bruyamment.

d. Je suis un marcheur lent qui ne cherche pas la performance mais le plaisir de
la marche dans un cadre sublime.

d’. Commencez à marcher lentement, puis accélérez le pas et marchez
rapidement pour les 5 prochaines minutes.

e. Il m'avait laissée tomber pour une fille qui se prenait pour un gars et qui était
d'une laideur abominable.

e’. Autant le dire tout de suite, c'est abominablement laid.

f. Ce qui m'ennuie plutôt c'est la vitesse atroce et la stabilité ... emm... très
« délicate » ... mais je réserve mon jugement pour plus tard ...

f’. Je suis désolée d'avoir mis si longtemps à donner de mes nouvelles mais le
temps passe atrocement vite non ??


Admittedly, the Web does yield a few marginal examples similar to those used by Payne et al. (2010) to reject the claim that adjectives and adverbs occur in complementary distribution, and hence the inflectional hypothesis for English (see § 2.2.2 above). In (23), for instance, the adverb appears in a nominal context and seems interchangeable with an adjective:


(23) Dans une pure tradition franco-britannique et dans la signature de cet hommage résolu à l'absurde du comique anglais, nous nous attaquons sans commune mesure à un pan entier de la culture d'une île insulairement sans frontière terrestre ni avec la Hollande...


²² Vite was in fact long categorized as an adjective in French, a categorization confirmed by the property noun vitesse and by the following quotation from Vialar, cited in the Trésor de la Langue Française (1971-1984): « En tête, c'est Pandore : un chien vite et solide, et qui prend bien les erres sur la feuille ».

However, even supposing that insulairement in (23) really does function as a post-nominal modifier of the noun île, the fact remains that in the great majority of cases adjectives and what are conventionally called adverbs in -ment occur in complementary distribution, as Giegerich (2012: 356) likewise notes: the few examples of this type are not enough to invalidate the inflectional hypothesis.


3.4.2 Some other properties


At least three further notable properties of French adverbs in -ment find an explanation under the inflectional hypothesis:


– the word-final position of the sequence -ment. The observation has been made for Italian by Ricca (1998) and for English notably by Geuder (2000), as well as, indirectly, by all the work that lists -ly among level 2 affixes, following the generalization of Siegel (1979)²⁴. Now, lexeme realization rules are held to apply after lexeme construction rules (see again Greenberg's Universal 28 and its embodiment in the split morphology hypothesis), at least where contextual inflection is concerned.

  If French adverbs in -ment are the realization of adjectives in a non-nominal context, it follows that -ment occupies word-final position and, since this is contextual inflection, that a form in -ment cannot serve as input to a lexeme construction rule (RCL). Making -ment the exponent of an RCL, by contrast, amounts to recording this property without explaining it;


– the fact that manner adverbs differ from the corresponding adjectives neither in their semantic function (both express properties, of individuals and events for the adjective vs of events only for the adverb²⁵) nor in their pragmatic function, to use the distinctions drawn by Croft (2003: 185). Adjectives and the corresponding manner adverbs in -ment fulfil the same pragmatic function of modification, even if a distinction should probably be made according to whether the modification applies to an object-type or an event-type referent (Croft, p.c.);


– the fact that the class of adverbs in -ment is an open class, as indeed is that of “short adverbs”, unlike the other subcategories of adverbs, which are held to be closed. Across the world's languages, the categories of nouns, verbs and adjectives, which form open classes, are contrasted with all the others (adpositions, conjunctions, articles, etc.), which form closed classes; this open/closed partition goes hand in hand with the opposition between lexeme and grammeme (major vs minor lexical category; content word vs function word; etc. For a partial challenge, see Croft 2000). Yet in languages that have the category adverb, all subclasses of that category are closed, except precisely that of manner adverbs (for English, see Haspelmath 2001: 16544; for French, Fradin 2003: 18). The hypothesis that treats manner and domain adverbs, with or without -ment, as forms of adjectives has the advantage of emptying the adverb category of its only presumably open subclass, so that the category, if it is maintained, becomes homogeneous and clearly a minor lexical category. It also provides a plausible explanation for the fact that the number of manner adverbs can grow: they owe this possibility to the fact that these adverbs (with or without an affixal mark) instantiate a cell in the paradigm of adjectives, and hence in the paradigm of a category that is itself open.


²³ One may also consider that it functions as a point-of-view adverb, glossable as « du point de vue insulaire » and bearing on the prepositional phrase that follows.

²⁴ According to the Affix Ordering Generalization of Siegel (1979), affixes divide into level 1 and level 2 affixes: under this much-debated principle (e.g. Fabb 1988), a lexeme resulting from level 2 affixation cannot serve as the base for level 1 affixation.

²⁵ Without going into detail, for agent-oriented adverbs (e.g. soigneusement) it has been hypothesized that they also have an individual-type argument. The same holds for resultative adverbs (e.g. confortablement), whose individual argument would be the implicit object resulting from the event. For arguments, see Geuder (2000), partly taken up in Bonami et al. (2004).


3.4.3 Consequence for the organization of the adjective category


We saw above that the notion of morphome resolves the question of the form, most often apparently feminine, of the stem to which -ment applies, provided that a third stem is added to the stem space of the adjective, most often homophonous with stem 2, to which the exponent -ment applies.

Under the inflectional hypothesis defended here, the consequence is that the adjective has two modes of variation, a primary one in nominal contexts and a secondary one in non-nominal contexts, and that the paradigm of the French adjective goes from five to six cells:


– in nominal contexts, the French adjective varies according to the traditional categories of gender and number, with, in addition, a specific form dedicated to the masculine singular liaison form (FLMS), following the hypothesis of Bonami & Boyé (2005) recalled above (§ 3.1);


– in non-nominal contexts, if one incorporates the hypotheses of Abeillé & Godard (2004) and Hummel (2013, 2014), which treat short adverbs as forms of adjectives (see § 3.3.1 above), two forms would compete within a single cell of the paradigm: a long form with -ment and a short form without -ment. On this last point, inflection does offer attested cases of overabundance (see Thornton 2012), that is, of competition between several forms for the same paradigm cell. In French this is the case, for example, of the verb asseoir, cited by Apothéloz & Boyé (2004), which has the three forms [asej], [asje], [aswa] for the same feature structure {ind, pres, 3pl}. The difference here would be that the competition is not occasional but systematic for the adjective category as a whole. The competition in non-nominal contexts remains to be explored further, which goes beyond the scope of the present article²⁶.


Table 4 offers a representation of the paradigm that incorporates the preceding proposal. In the standard language, the short adverb is homomorphic with the adjective inflected in the masculine singular, outside liaison:


Table 4: Paradigm of the French adjective


Nominal context
             Singular                               Plural
Masculine    A {masc., sg, -liaison}                A {masc., pl}
             A {masc., sg, +prenominal liaison}
Feminine     A {fem., sg}                           A {fem., pl}

Non-nominal context
             Adverb in -ment
             Short adverb
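
The six-cell paradigm of Table 4 can be sketched as a small data structure. What follows is only an illustrative sketch under stated assumptions: the class name, the cell labels, and the choice of fort and nouveau as examples are hypothetical and not part of the original analysis, and only orthographic forms are recorded. The single non-nominal cell holds a list, so that competition between a long and a short form shows up as more than one entry.

```python
from dataclasses import dataclass, field


@dataclass
class AdjectiveParadigm:
    """Six-cell adjective paradigm, orthographic forms only (illustrative)."""

    lexeme: str
    masc_sg: str            # masculine singular, outside liaison
    masc_sg_liaison: str    # masculine singular prenominal liaison form
    masc_pl: str
    fem_sg: str
    fem_pl: str
    # Single non-nominal cell; several forms here means overabundance.
    non_nominal: list[str] = field(default_factory=list)

    def cells(self) -> dict[str, list[str]]:
        """Map each paradigm cell to the form(s) that fill it."""
        return {
            "masc.sg, -liaison": [self.masc_sg],
            "masc.sg, +liaison": [self.masc_sg_liaison],
            "masc.pl": [self.masc_pl],
            "fem.sg": [self.fem_sg],
            "fem.pl": [self.fem_pl],
            "non-nominal": list(self.non_nominal),
        }

    def overabundant_cells(self) -> list[str]:
        """Names of the cells filled by more than one competing form."""
        return [name for name, forms in self.cells().items() if len(forms) > 1]


# Hypothetical illustrations: a long and a short form compete for fort,
# while nouveau has a distinct liaison form and a long form only.
FORT = AdjectiveParadigm(
    "fort", "fort", "fort", "forts", "forte", "fortes", ["fortement", "fort"]
)
NOUVEAU = AdjectiveParadigm(
    "nouveau", "nouveau", "nouvel", "nouveaux", "nouvelle", "nouvelles",
    ["nouvellement"],
)
```

Storing a list in the non-nominal cell is a design choice made for this sketch: it keeps the paradigm at six cells while allowing the systematic competition between forms discussed above.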


4 Conclusion 


At first sight, in a theory that takes the lexeme as its basic unit, the status of adverbial sequences in -ment should in principle be easy to establish: if they are lexemes, they are the products of a lexeme construction rule, which as such forms lexemes distinct from those it takes as input; if they are word-forms, they result from an inflectional rule, which accordingly serves to realize word-forms of the lexemes on which it operates.

As regards French adverbs in -ment, it has emerged that what many linguists and many textbooks cite as the case of derivation par excellence very much deserves discussion. In the light of work on other languages, a body of arguments suggests that their properties are those of word-forms rather than of lexemes, and that “adverb in -ment” is a convenient label for the form an adjective may take in a non-nominal context: far from constituting a grey area between inflection and derivation, the addition of French -ment would thus be fully an inflectional rule.

A few points nevertheless remain to be substantiated, stated here as questions, before the inflectional hypothesis can win definitive acceptance:


– according to what criterion or criteria is the choice made between the short form and the -ment form of the adverb?


– are there other attested cases, in inflection, of word-forms breaking free of the lexeme to whose paradigm they belong?


²⁶ One avenue to explore, suggested to me by Dany Amiot, would be a tendency towards complementary distribution between short forms, preferentially assigned to adjectives expressing a property perceptible to the senses (parler haut / fort / bas; jouer gros / petit, etc.), and long forms, which have little affinity with this semantic type of adjective.



This article presents the conditions under which two morphological patterns emerged in Guadeloupean Creole: denominal verbal suffixation in -é (N-é_V) (e.g. biké ‘se réfugier’ ← bik ‘refuge’, miganné ‘mélanger’ ← migan ‘purée’) and denominal verbal parasynthesis (dé-N-é_V) (e.g. déchèpiyé ‘mettre en charpie’ ← chèpi ‘charpie’, dépyété ‘retirer les pattes (crabes)’ ← pyét ‘pattes’). We show that these patterns emerged through the reanalysis of Verb / Noun morphological pairs, massively inherited from French, the lexifier language, which result either from conversion (bros ‘brosse’ / brosé ‘brosser’) or from prefixation (bwa ‘bois’ / débwazé ‘déboiser’). The article defends the hypothesis that it is notably the property of Creole lexemes of appearing in a single form only that led to these reanalyses: since Creole verbs do not vary inflectionally, the inherited French inflectional ending /e/ is reanalysed as a derivational suffix, thus following a process of deflexionalization typical of language change.


1 Introduction 


The reflection carried out over the past fifty years on lexical identity and the notion of lexeme, notably by morphologists, has shed light on the analysis of French derived words involving verbs. Thus denominal verbs, traditionally treated as suffixed by means of the infinitive marker (boiser, plumer, neiger) or as parasynthetic, through the simultaneous addition of a prefix and an infinitive suffix (embarquer, désosser, décourager), could be analysed as converted forms (boiser, plumer, neiger) or as prefixed forms (embarquer, désosser, décourager) on a nominal base once a theoretical reflection on the identity of the lexeme had been carried out (see § 2). But this analysis of the French derivatives is called into question once they enter the French-based creoles, and we witness something of a reversal with respect to the traditional analyses. Although these creoles have inherited a good share of the converted and prefixed denominal verbs of French, the morphological analysis they call for in creole is radically different: where the noun/verb pairs are instances of conversion in French, in creole they are formed by suffixation; and where the pairs are interpreted as prefixations in French, they must be seen as parasyntheses in creole. This reanalysis of the constructed noun/verb pairs inherited from French has become systematic in creole, leading to the creation of new morphological patterns that have become fully available.

This article presents the conditions under which these two morphological patterns emerged in creole, denominal verbal suffixation in -é (henceforth N-é_V¹) and denominal verbal parasynthesis (henceforth dé-N-é_V²), and defends the hypothesis that it is notably the property of Creole lexemes of appearing in a single form only that led to these reanalyses (§ 3).

The analysis we present is relevant to several French-based creoles (at least Martinican, Haitian and Saint Lucian), but relies solely on data from Guadeloupean Creole. Resources for building a large-scale database of the Guadeloupean lexicon are largely lacking, both lexicographically and digitally (see Villoing & Deglas 2016, § 2). In the absence of a reliable and directly usable resource, we based our study on an original corpus compiled by Maxime Deglas, a native speaker, from several sources:


(i) the existing dictionaries of Guadeloupean (Ludwig et al. 2012, Poullet & Telchid 1984, Tourneux & Barbotin 1990), whose entries were filtered through field surveys checking their attestation with native speakers;


(ii) field surveys conducted with some forty native speakers from all the islands of Guadeloupe;


(iii) a corpus drawn from terminology monitoring carried out on literary works in Creole, television programmes and traditional songs (see Villoing & Deglas 2016, § 2 for further details).


The corpus thus compiled comprises 7680 lexical units of Guadeloupean Creole, a size equivalent to that of the existing dictionaries. It includes 1805 verbs and 4643 nouns, which made possible the specific study of Noun/Verb morphological relations in denominal verbal suffixation in -é and denominal verbal parasynthesis. The corpus is stored electronically in a database that can be queried by several criteria, phonological, semantic and syntactic, which allow a fine-grained study.


¹ The representation N-é_V of the structure of denominal verbs suffixed in -é reads as follows: N stands for the nominal base, -é for the suffix, and V for the syntactic class (verb) of the derivative.

² The representation dé-N-é_V of the structure of denominal verbs affixed with dé-...-é reads as follows: N stands for the nominal base, dé-...-é for the parasynthetic affix, whose phonological form comprises a prefix dé- combined with a suffix -é, and V for the syntactic class (verb) of the derivative.

We study this corpus within the theoretical approach of lexeme-based morphology (see e.g. Matthews 1991, Aronoff 1994, Anderson 1992, Fradin 2003, Booij 2010), which holds that the basic units of morphology are lexemes (and not morphemes). We adopt a perspective that credits creole languages with a dynamic morphology (at least as far as lexical morphology is concerned), in opposition to those who deny it (Valdman 1978, Seuren & Wekker 1986, McWhorter 1998, for example). The demonstration begins with a presentation of the debates surrounding the analysis of converted and prefixed Noun/Verb pairs in French (§ 2) and then develops our hypothesis of their reanalysis in creole, which led to the creation of new morphological patterns, N-é_V suffixation and dé-N-é_V parasynthesis (§ 3).


2 Analysis of N/V pairs in French


The French-based creoles inherited part of the French lexicon, which is still widely represented in the creole language today (for Guadeloupean, for example, 90% of words are of French origin, coming mainly from 17th-century popular French but also from contemporary borrowings, according to Hazaël-Massieux 2002). This inherited lexicon, clearly recognizable despite some phonological divergences from its French source, includes pairs of lexemes that are morphologically constructed in French, such as (1) and (2).


(1) a. bò / débòdé (‘bord’ / ‘déborder’)
    b. bwa / débwazé (‘bois’ / ‘déboiser’)
    c. figi / défigiré (‘figure’ / ‘défigurer’)
    d. fòwm / défòwmé (‘forme’ / ‘déformer’)
    e. kras / dékrasé (‘crasse’ / ‘décrasser’)
    f. rasin / dérasiné (‘racine’ / ‘déraciner’)

(2) a. adisyon / adisyonné (‘addition’ / ‘additionner’)
    b. bav / bavé (‘bave’ / ‘baver’)
    c. bros / brosé (‘brosse’ / ‘brosser’)
    d. divòs / divòsé (‘divorce’ / ‘divorcer’)
    e. fèt / fété (‘fête’ / ‘fêter’)
    f. savon / savonné (‘savon’ / ‘savonner’)


These inherited Noun/Verb pairs stand in a morphological relation in French that can no longer be recognized for them in creole. The following paragraphs give a brief overview of the morphological analyses they receive in French, before presenting the morphological analysis we propose for them in Guadeloupean Creole.


2.1 Pairs of the bois / déboiser type


The formation of the verbs in (1) in French has been much debated. A tradition going back to the 19th century analysed them as morphologically constructed by parasynthesis, that is, by a morphological construction in which a base is simultaneously prefixed and suffixed. This analysis goes back at least to Arsène Darmesteter.


“This kind of composition³ is very rich: the verbs it forms, which are known as parasynthetic, have the remarkable characteristic of being the result of a composition and a derivation acting together on the same stem, such that neither can be removed without causing the loss of the word. Thus from barque one makes em-barqu-er, dé-barqu-er, two compositions that are absolutely one, in which one finds neither the compounds débarque, embarque, nor a derivative barquer, but the stem barque.” Darmesteter (1894: 24)


The analysis was widely taken up in the 20th century by Nyrop (1936: 215), and was still very successful from the 1970s onwards in other theories, such as Transformational Generative Grammar (Dubois 1962, Guilbert 1975, Zribi-Hertz 1972, Scalise 1994) or the lexicalist framework (Booij 1977). It also spread to traditional grammars (Grevisse & Goose 1988: 253) and school grammars in France (see for example Chevalier et al. 1964: 54, Béchade 1992: 119), and even to textbooks of French morphology (Gardes-Tamine 1988: 65, Apothéloz 2002: 91, Huot 2006: 121-122). Despite its popularity, the parasynthetic analysis was challenged for these verbs by Dell (1970: 201-202), then more broadly by Corbin (1987: 121-139), followed by Fradin (2003: 288-307). The criticism rests in every case on the recurrent analytical error made about the form of the verb taken metalinguistically: the infinitive affix (which conventionally appears in the citation form of the verb) is equated with a derivational suffix. This error stems partly from a confusion between language and metalanguage (Corbin 1987: 124) and partly from the fact that the theoretical frameworks do not define the lexical individual theoretically. A twofold confusion is thus at work (Kerleroux 2000): first, between the metalinguistic citation form of the verb (traditionally the infinitive in French) and its phonological form, and second, between the phonological form of the verb and the lexical individual. Thus,


“the categorial relation N>V will be seen as a suffixation, since the infinitive form (in its citational role) is taken for the verb itself, and since the French infinitive has a suffix (unlike English). [...] The whole problem is that this implies seeing in the inflectional infinitive suffix a suffix that is also derivational...” (Kerleroux 2000: 9)


³ Darmesteter speaks of composition to characterize prefixation, thereby reflecting the fact that some prefixes derive from Latin prepositions.


Now, it has been clearly shown that the infinitive affix cannot be identified with a derivational suffix, as proved by the fact that it never appears in derivation, where the stem alone always serves as the base (Corbin 1987: 129, Lyons 1977: 19, Fradin 2003: 93, Fradin et al. 2009: 9, for example). It thus took more than a century to show that the infinitive suffix of the citation form does not belong to the lexeme as a lexical unit.

This challenge yields a new analysis according to which “verbal pseudo-parasyntheses are in fact merely prefixations” (Corbin 1987: 129): the base is nominal and the derivative verbal. From this perspective, the data in (1) are analysed, in French, as verbs prefixed on nominal bases, with the structure in (3):


(3) [dé- [N]]_V


These verbalizing denominal prefixes have, according to Corbin, a property that sets them apart from most prefixes: they change the category of the base, just as most suffixes do. The fact that this property of prefixes went unrecognized by a whole tradition also contributed largely, according to Corbin, to the analysis in terms of parasynthesis.

The Noun/Verb morphological pairs in (2) above have suffered from an analytical error of the same kind.


2.2 Pairs of the brosse / brosser type


The formation of the French verbs in (2) has also been much debated. In the literature on French morphology, the analysis of these pairs ran into the same obstacles as that of prefixed denominal verbs: the infinitive suffix of the citation form of the verb was interpreted by a whole tradition as a derivational suffix.


It is this same alleged suffixation that appears in the formation of unprefixed denominal verbs such as clouer, or in deadjectival verbs such as brunir, rougir. (Dell 1970: 200-202)


Depending on the direction of the morphological operation (from noun to verb or from verb to noun), the disappearance (V → N direction) or the appearance of the suffix (N → V direction) was seen as involving two different mechanisms:


– “back-formation” (dérivation régressive, a term found in Nyrop (1936), in traditional grammars (Grevisse & Goose 1988) and in some morphology textbooks (Gardes-Tamine 1988)) accounts for an apocope of the infinitive suffix, allowing a noun to be formed from a verb (for example voler → vol);

– a mechanism of infinitive suffixation allowing a noun to become a verb (plante → planter). However, this relation between noun and verb was not clearly recognized by the early grammarians as belonging to morphology, as shown by the vagueness with which it is treated, for example, by Nyrop (1936), Meyer-Lübke (1894) and, later, by traditional grammars (see for example Grevisse & Goose 1988: 238).


Here again, the flaw in these analyses is the absence of theoretical questioning about the identity of the lexeme, which leads to confusing citation form and lexical unit. More recent approaches respond to these mistaken analyses by treating the pairs in (2) as constructed by an operation of conversion from noun to verb or from verb to noun (for French, see Corbin 1987, 2004, Mel'čuk 1996, Kerleroux 2000, Fradin 2003, Namer 2009, Tribout 2010). The apparent phonological difference between noun and verb is due solely to the French convention of citing verbs by their infinitive form and nouns by their singular form. But the phonological forms of the base and derived lexemes (in other words, their stems) are identical in every respect, which makes it possible to recognize a morphological relation of conversion between them.

Thus the pairs in (2) can be analysed either with structure (4a) or with structure (4b), with no affix of any kind involved:


(4) a. [N]_V
    b. [V]_N


3 Analyses of N/V pairs in creole


The data in (1) and (2), formed by prefixation or by denominal verbal conversion in French and then inherited, cannot however receive the same analysis in creole. In the following paragraphs we argue for the twofold hypothesis that, in creole,


(i) the morphological relation between the nouns and the verbs in -é in (2) corresponds to verbal suffixation on a nominal base (N-é_V), and not to conversion as in French;


(ii) the morphological relation between the nouns and the dé-N-é_V verbs in (1) corresponds to parasynthesis rather than to prefixation as in French.
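
The two creole schemas can be illustrated with a deliberately naive check over written forms. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, the test is purely orthographic, and it ignores stem adjustments, so that pairs such as bwa / débwazé or savon / savonné fall outside it even though the analysis above covers them.

```python
def matches_n_e(noun: str, verb: str) -> bool:
    """True if the verb is exactly the noun followed by the suffix -é."""
    return verb == noun + "é"


def matches_de_n_e(noun: str, verb: str) -> bool:
    """True if the verb is exactly dé- + noun + -é (parasynthetic shape)."""
    return verb == "dé" + noun + "é"


def classify(noun: str, verb: str) -> str:
    """Label a Noun/Verb pair by the surface schema it instantiates."""
    if matches_de_n_e(noun, verb):
        return "dé-N-é"
    if matches_n_e(noun, verb):
        return "N-é"
    # Stem adjustments (bwa / débwazé, savon / savonné) are not handled here.
    return "other"


# Pairs taken from examples (1) and (2).
PAIRS = [
    ("bros", "brosé"),
    ("bav", "bavé"),
    ("divòs", "divòsé"),
    ("kras", "dékrasé"),
    ("rasin", "dérasiné"),
    ("bwa", "débwazé"),
]

for noun, verb in PAIRS:
    print(f"{noun} / {verb}: {classify(noun, verb)}")
```

The point of the sketch is only that, once the verb has a single invariant form, the final -é is part of the form being compared with the noun, which is what makes a suffixal or parasynthetic reading available.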


These results lead us to conclude that these Noun/Verb morphological pairs underwent reanalysis from French to creole⁴, a reanalysis due in large part to the property of Creole lexemes of appearing in a single form only. Section 3.1 opens with this property of verbs in Guadeloupean Creole.


⁴ We understand “reanalysis” in the general sense of Langacker (1977: 58), namely a change in the (morphological) structure of a lexeme that does not involve any modification of its surface phonological form. See also the use made of it by DeGraff (2001: 67-68).


3.1 Verbs in Guadeloupean Creole

3.1.1 Morphology


Verbs in Guadeloupean Creole, like all other lexical units, have no inflectional morphology, which the literature points to by mentioning either the absence of inflection in creole languages or a poor, even non-existent, morphology. Tense-Aspect-Mood properties are handled by particles that precede the verb, as is generally observed in French-based creoles (see Valdman 1978, Bernabé 1987, Mufwene & Djikhoff 1989, Hazaël-Massieux 2002: 71; see also Germain 1976: 109-134 for Guadeloupean).

When verbs are inherited from French, a single form of the verb is retained in creole. In principle it is either the infinitive form, the past participle form, or one of the present indicative or imperative forms (Germain 1976: 110). For verbs of the 1st and 2nd groups, the origin of the inherited form cannot be decided, since the past participle and infinitive forms are homophonous in speech, with a final


– /e/ for 1st-group verbs,


– /i/ for 2nd-group verbs (bearing in mind that in the 17th century, the period when most of the French lexicon was inherited, the final /r/ of infinitives in -ir was no longer pronounced, before being restored later under the influence of grammarians and poets).


Table 1 presents the various verb endings of creole verbs inherited from French verbs, together with the inflected forms assumed to be their source.

The endings are not all equally represented in the Guadeloupean lexicon, and verbs ending in -é form a very large majority (all origins combined: inherited, constructed in creole or other; see Table 2)⁵. We assume that this very high proportion is linked to a massive inheritance of French verbs ending in -é, an inheritance that would have had a major impact on creole morphology (see § 3.1.2 below).
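
As a quick check on the proportions reported in Table 2, the shares can be recomputed from the raw counts. The snippet below is a minimal sketch: the counts are those printed in Table 2, the variable names are illustrative, and the ending labels follow the table as far as it can be read.

```python
# Verb counts per ending, as printed in Table 2.
counts = {
    "-é": 1451,
    "-i": 147,
    "-è": 25,
    "-ann": 30,
    "-wè": 10,
    "-enn": 11,
    "-èt": 15,
    "-an": 10,
    "-o": 8,
    "others": 98,
}

total = sum(counts.values())

# Share of each ending among all verbs, in percent.
shares = {ending: 100 * n / total for ending, n in counts.items()}

for ending, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{ending:8s} {counts[ending]:5d}  {share:5.1f}%")
```

The counts add up to the 1805 verbs of the corpus, and the -é class alone accounts for roughly four fifths of them, which is the imbalance the paragraph above relies on.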


3.1.2 Inherited verbs versus creole verbs


The distinction, within the creole lexicon, between inherited verbs and creole verbs (or “indigenous” verbs, in the terminology of Lefebvre (2003) and Brousseau (2011)) is open to discussion, since cases of complete inheritance are rare. In passing from French to creole, verbs may have undergone phonological, semantic or syntactic changes. One position is to treat as non-French any inherited lexeme that has undergone variation in creole: for Brousseau (2011: 68), for example, the Saint Lucian lexemes pitiab ‘pitoyable’ and lonvi ‘longues-vues’ are regarded as bases that do not exist in French because of the phonological gap between the two languages, and kouvé ‘couvrir’ because of its semantic difference from


⁵ In Table 2, the class “others” mainly includes verbs with a consonant-final form, a good number of which are built by compounding a verb and a noun (bat chat ‘battre en retraite’, pèd lakat ‘perdre la tête’).


Table 1: Verbs inherited from French.

Group  Creole   French        Creole forms (from infinitives,     Gloss
       ending   ending        past participles, or present
                              indicative/imperative forms)

1      -é       /e/           karésé                              ‘caresser’
                              brosé                               ‘brosser’
                              gomé                                ‘gommer’
                              blagé                               ‘blaguer’

2      -i       /ir/~/i/      asorti                              ‘assortir’
                              chwazi                              ‘choisir’
                              fléri                               ‘fleurir’
                              nwasi                               ‘noircir’

3      -è       /ɛr/          dékouvè                             ‘découvrir’
                              ofè                                 ‘offrir’
                              wouvè                               ‘ouvrir’
                              soufè                               ‘souffrir’

                /ɛ/~/ɛr/      fè, fè, fè                          ‘faire’
                              plè, plè                            ‘plaire’

       -an      /ɑ̃/~/ɑ̃dr/     aprann, aprann                      ‘apprendre’
                              défann, défann                      ‘prendre la défense’
                              étann, étann                        ‘étendre’
                              fann, fann                          ‘fendre’

       -wè      /wa/~/war/    bwè, bwè                            ‘boire’
                              kwè, kwè                            ‘croire’
                              pèsivwè, pèsivwè                    ‘apercevoir’
                              wousouvwè, wousouvwè                ‘recevoir’

       -i       /ir/~/i/      bouyi                               ‘bouillir’
                              fui, fui                            ‘fuir’
                              manti                               ‘mentir’
                              rèdi, rèdi                          ‘redire’
                              sòti                                ‘sortir’

       -enn     /ɛ̃/~/ɛ̃dr/     détenn, détenn, détenn              ‘déteindre’
                              étenn, étenn, étenn                 ‘éteindre’
                              krenn, krenn                        ‘craindre’
                              soutyenn                            ‘soutenir’
                              tenn, tenn, tenn                    ‘teindre’

       -èt      /ɛtr/         admèt                               ‘admettre’
                              disparèt                            ‘disparaître’
                              pwomèt                              ‘promettre’
                              rèkonnèt                            ‘reconnaître’

                /ɛt/          défèt                               ‘défaire’


Table 2: Proportion of Guadeloupean verbs by ending.

Verb ending   Number   % of all verbs

-é            1451     80%
-i            147      8.1%
-è            25       1.4%
-ann          30       1.6%
-wè           10       0.5%
-enn          11       0.6%
-èt           15       0.8%
-an           10       0.5%
-o            8        0.4%
others        98       5.4%

Total         1805     100%


the verb couver. We depart from this position and treat as inherited from French any verb whose French origin is recognizable, phonologically and semantically, despite the changes it has undergone in the creole. Thus, among Brousseau's examples, only kouvé ‘couvrir’ would not be recognized as being of French origin, because its meaning is too remote from that of French couver. Our choice rests on two facts: (i) it is extremely difficult to know precisely the phonology and semantics of lexemes inherited from an earlier or regional state of French, and hence to determine with certainty the gap between the putative French verb and its inherited creole counterpart; (ii) almost every lexeme inherited from French has undergone some phonological or even semantic change, however minor, and it would be difficult to establish criteria separating the lexemes altered enough to be classed as creole from the rest.

To determine the French origin of a creole lexeme, we relied on its attestation as a headword in a dictionary of French, irrespective of dictionary type, register, or dialectal variety (see also Brousseau 2011: 68 on the usefulness of dictionaries from earlier centuries up to the 20th). The search is greatly facilitated by the Web, which gives access to several types of French dictionaries and makes it possible, in particular, to recover verbs that are now lost but belong to an earlier state of the language or to a dialect of French, and which are assumed to form the core of the creole lexicon (cf. for example Thibault 2012: 12).

These criteria allow us to distinguish inherited verbs from two other types of verbs:

(i) verbs morphologically constructed in the creole, whatever the morphological process and whatever the base (non-inherited bases (5), inherited bases (6), inherited bases with phonological (7) or semantic (8) change).
(5) a. bik ‘refuge’ → biké ‘se réfugier’
    b. fifin ‘bruine’ → fifiné ‘bruiner’
    c. migan ‘purée, mélange’ → miganné ‘mélanger’
    d. plich ‘correction’ → pliché ‘donner une correction’
    e. vonvon ‘bourdon’ → vonvonné ‘bourdonner’

(6) a. balkon ‘balcon’ → balkonné ‘être au balcon’
    b. garé ‘garer, stationner’ → dégaré ‘sortir de la place de garage, de stationnement’
    c. lang ‘langue’ → langé ‘embrasser’
    d. pyé ‘pattes’ → dépyété ‘retirer les pattes (crabe)’
    e. tik ‘tique’ → détiké ‘retirer les tiques’

(7) a. fouch ‘fourche’ → fouchté ‘bêcher’
    b. katyé ‘morceau’ → dékatyé ‘couper en quartier’
    c. nwel ‘Noël’ → nwélé ‘fêter Noël’
    d. pengné ‘peigner’ → dépengné ‘défaire une coiffure’
    e. ves ‘veste’ → vèsté ‘mettre sa veste’

(8) a. kabann ‘lit’ → kabanné ‘traîner au lit’
    b. kaz ‘maison’ → dékazé ‘déplacer une maison à l'aide d'un véhicule pour l'installer ailleurs’
    c. loup ‘boursouflure’ → loupé ‘enfler’
    d. parad ‘étalage’ → paradé ‘parader’


(ii) verbs that meet neither of these criteria, being neither inherited from French nor constructed in the creole, and whose origin may be known (e.g. a borrowing from English, Spanish, African languages, or another source) or unknown.


(9) a. bénékaki ‘hésiter’
    b. griji ‘s’égratigner’
    c. kôviyé ‘tordre’
    d. lolé ‘remuer’
    e. tòtòy ‘agacer’


On the basis of this three-way division of creole verbs (inherited verb, verb constructed in the creole, other verb), we obtain the following proportions (cf. Table 3, which shows only the three best-represented endings: -é, -i, and -ann).


Table 3: Proportion of inherited, constructed, and other verbs by ending (verbs of Guadeloupean Creole). In the Total column, percentages are shares of all verbs; in the other columns, they are shares of the verbs with the given ending, except in the Total row, where they are shares of all verbs.

Ending    Total            Inherited        Constructed in creole    Other
          n       %        n       %        n       %                n      %
-é        1451    80%      1230    84%      153     10.5%            66     4.5%
-i        147     8.1%     122     83%      17      11.5%            8      5.5%
-ann      30      1.6%     27      90%      3       10%              0      0%
Total     1805    100%     1468    81%      248     14%              86     5%


Our corpus thus contains a large majority of verbs inherited from French: of the 1805 verbs listed, 1468 are inherited, i.e. 81% of the creole's verbs. Among these inherited verbs, most end in -é (84%). Far behind come inherited verbs in -i, which account for only 8.3% of inherited verbs (122 inherited verbs in -i out of 1468 inherited verbs). Verbs with other endings (-ann, -è, -wè, etc.) are fewer still and very marginal. This ranking is largely mirrored among verbs constructed in the creole: here again, verbs in -é are the best represented (61.5%, i.e. 153 constructed verbs in -é out of 248 constructed verbs), followed at a distance by verbs in -i (under 7%). The other verbs remain epiphenomenal. This parallel between the endings of inherited verbs and those of verbs constructed in the creole reasonably leads to the hypothesis that the inherited lexicon weighed heavily on the morphological formation of creole verbs. Thus, insofar as most inherited verbs end in -é and creole verbs derived from nominal bases also mostly show this ending, we hypothesize that the inflectional ending -é of inherited verbs was reanalysed, under certain circumstances, as a derivational suffix in the creole. Section 3.2 presents hypotheses on the conditions of this reanalysis. We will not examine further here the possible reanalysis of the endings of inherited verbs in -i, but we note that, despite the very small share of these verbs in the creole lexicon (8.1%), the share of constructed verbs in -i is proportionally equivalent to that of constructed verbs in -é (11.5% against 10.5% for verbs in -é), which would lend credibility to the hypothesis that a verbalizing suffix -i has been created in Guadeloupean Creole.
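The percentages discussed above can be rechecked from the raw counts. The following is a minimal sketch, assuming the counts of Table 3 as given; the dictionary layout and the function names are illustrative and not part of the original study. Recomputed values may differ by a few tenths of a point from the rounded figures quoted in the text.

```python
# Counts copied from Table 3; the data layout and names are illustrative only.
TABLE_3 = {
    "-é":   {"total": 1451, "inherited": 1230, "constructed": 153, "other": 66},
    "-i":   {"total": 147,  "inherited": 122,  "constructed": 17,  "other": 8},
    "-ann": {"total": 30,   "inherited": 27,   "constructed": 3,   "other": 0},
}
TOTAL_VERBS = 1805
TOTAL_INHERITED = 1468
TOTAL_CONSTRUCTED = 248


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def share_within_ending(ending, kind):
    """Share of verbs of a given kind among all verbs with this ending."""
    row = TABLE_3[ending]
    return share(row[kind], row["total"])


if __name__ == "__main__":
    for ending in TABLE_3:
        print(
            ending,
            "constructed within ending:", share_within_ending(ending, "constructed"),
            "| share of all constructed verbs:",
            share(TABLE_3[ending]["constructed"], TOTAL_CONSTRUCTED),
        )
```

The two measures answer different questions: the first compares kinds of verbs inside one ending (the basis for the -i versus -é comparison), the second compares endings inside one kind of verb (the basis for the ranking of endings).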


3.2 Reanalysis of N/V conversion pairs as suffixations


On our hypothesis, the reanalysis of French verbs in -é in the creole was possible only in the creole lexical context in which these French verbs were inherited together with the French nouns related to them by conversion, whether noun-to-verb (N → V) or verb-to-noun (V → N) conversion (cf. (10)). The lexicon of Guadeloupean Creole thus contains Noun/Verb conversion pairs inherited from French for which an analysis in terms of conversion is not valid in the creole.


3.2.1 From conversion in French to suffixation in the creole


The main reason why a suffixation relation is perceived in the creole between these Noun/Verb pairs is that the final -é of the verb appears as additional phonological material relative to the phonological form of the base noun (10). Treating this as noun-to-verb conversion would run counter to the very notion of conversion, since the stems here differ phonologically.


(10) a. adisyon ‘addition’ / adisyonné ‘additionner’
     b. bav ‘bave’ / bavé ‘baver’
     c. bros ‘brosse’ / brosé ‘brosser’
     d. divòs ‘divorce’ / divòsé ‘divorcer’
     e. fèt ‘fête’ / fété ‘fêter’
     f. mank ‘manque’ / manké ‘manquer’
     g. savon ‘savon’ / savonné ‘savonner’


Since creole verbs have only one form, the verbs in (10) show only the form with final -é. This final -é thus belongs to the verb as a lexical unit and is not the infinitive marker found in the citation form of the French verb. The Noun/Verb pairs in (10) inherited from French therefore cannot receive the same analysis in French and in the creole. They differ from the Noun/Verb pairs in (11), which do stand in a morphological relation of conversion in the creole (of the N → V or V → N type). Indeed, in the creole as in all other languages, nouns and verbs related by conversion are phonologically identical in every respect (cf. (11a) for Noun/Verb conversion pairs ending in -é and (11b) for Noun/Verb conversion pairs with another final vowel).


(11) a. i.   balyé_N ‘balai’ / balyé_V ‘balayer’
        ii.  chanté_N ‘chanson’ / chanté_V ‘chanter’
        iii. goumé_N ‘combat’ / goumé_V ‘se battre’
        iv.  lélé_N ‘touillette’ / lélé_V ‘touiller’
        v.   manjé_N ‘repas, mets’ / manjé_V ‘manger’
        vi.  tété_N ‘sein’ / tété_V ‘téter’
     b. i.   anvi_N ‘envie’ / anvi_V ‘avoir envie’
        ii.  bobi_N ‘assoupissement’ / bobi_V ‘somnoler’
        iii. kaka_N ‘excrément’ / kaka_V ‘déféquer’
        iv.  mo_N ‘mort’ / mo_V ‘mourir’
        v.   travay_N ‘travail’ / travay_V ‘travailler’
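The contrast between (10) and (11) amounts to a simple formal test on a Noun/Verb pair. The sketch below is only an illustration of that test, under the assumption that comparing orthographic strings is a fair stand-in for comparing phonological forms; the function name and labels are hypothetical.

```python
def classify_pair(noun: str, verb: str) -> str:
    """Classify a creole Noun/Verb pair by comparing its two forms.

    Assumption: spelling is used as a proxy for phonological form.
    """
    if noun == verb:
        # Identical forms: the pair can be related by conversion, as in (11).
        return "conversion"
    if verb.startswith(noun) and verb.endswith("é"):
        # The verb is the noun plus extra final material ending in -é, as in (10).
        return "suffixation"
    # Anything else (stem alternations, unrelated forms) needs a closer look.
    return "other"


if __name__ == "__main__":
    for noun, verb in [("bav", "bavé"), ("balyé", "balyé"), ("chwa", "chwazi")]:
        print(noun, verb, classify_pair(noun, verb))
```

Pairs whose stems alternate, such as fèt / fété, fall under "other" in this sketch even though the text analyses them as suffixations; a phonological representation rather than spelling would be needed to handle them.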
Moreover, the conversion hypothesis cannot be rescued by treating the final -é of the verbs in (10) as a specifically verbal marker:

(a) either an inflection-class marker (a thematic vowel);

(b) or a marker identifying the category verb.

Neither hypothesis holds. Hypothesis (a), a thematic vowel, fails because the creole has no inflectional system for verbs, so a thematic vowel would serve no purpose. Hypothesis (b) fails as well because verbs show a variety of final segments (final /i/, /e/, /we/, presented above in Table 1, to which one may add /o/, /j/, /ɔ̃/ in (12)), and it is hard to imagine that the language has that many verbal markers, especially since nouns, too, end in /e/, whether or not they are inherited (cf. (13a) for inherited nouns and (13b) for creole nouns):


(12) a. bo, mo, cho, fé ko
        ‘embrasser’, ‘être mort’, ‘avoir chaud’, ‘unifier’
     b. totoy, kay
        ‘agacer’, ‘aller’
     c. mawon, fé-fon
        ‘s’échapper’, ‘compter sur’

(13) a. chaplé, bondyé, fiyansé, pyé, zyé, sòsyé
        ‘chapelet’, ‘dieu’, ‘fiancé’, ‘pied’, ‘yeux’, ‘sorcier’
     b. bankoulélé, kyolé, matété, wélélé
        ‘vacarme’, ‘ribambelle’, ‘plat à base de riz et crabe’, ‘brouhaha’


The conversion hypothesis therefore fails in every case. Since the -é on the verb is additional phonological material relative to the noun, and since both category and meaning change, everything suggests that the verb is morphologically more complex than the noun. We must therefore posit a formation involving verbal suffixation in -é on nominal bases.


3.2.2 The impossibility of a noun-formation rule deleting -é


Another hypothesis could also have been considered: a rule constructing nouns on verbal bases by deleting the verb's final -é (a “back-formation”). But this hypothesis faces several difficulties:

(a) the first is that this mode of formation is traditionally regarded as rare across languages (on “subtractive morphology” or “deletion” and its rarity, see textbooks such as Anderson 1992: 64-66; Haspelmath 2002: 24; Fradin 2003: 47);

(b) the second rests on Noun/Verb pairs in which the noun is inherited from French but the verb is not and must therefore have been constructed in the creole (14); yet this verb shows an additional final -é.


(14) a. alyans ‘alliance’ → alyansé ‘se lier’
     b. bwa ‘bras’ → bwaré ‘enlacer’
     c. chikann ‘contestation’ → chikanné ‘contester’
     d. fèr ‘fer à cheveux’ → féré ‘défriser les cheveux’
     e. janm ‘jambe’ → janbé ‘enjamber’
     f. tij ‘bourgeon’ → tijé ‘bourgeonner’


Since the noun is inherited from French and the verb constructed in the creole, the noun cannot be derived from the verb by a rule deleting the verb's final -é; it is indeed the verb that is formed by suffixation on the noun base.

(c) the third argument rests on the absence of creole deverbal nouns constructed by deleting the -é of an inherited verb. Our corpus provides no noun derived from an inherited verb simply by deleting final -é. The final -é of inherited verbs may disappear in the course of a derivation, but only when the derivation is by suffixation (see, for example, (15) for V → N suffixation in -è/-èz, (16) for V → N suffixation in -aj, and (17) for V → N suffixation in -asyon).


(15) a. fiyansèz ‘fiancée’ ← fiyansé ‘se fiancer’
     b. kouyonnèz ‘celle qui couillonne’ ← kouyonné ‘couillonner’
     c. soutirèz ‘celui qui couvre les bêtises de qqun’ ← soutiré ‘couvrir les bêtises de qqun’

(16) a. bokantaj ‘échange’ ← bokanté ‘échanger’
     b. diraj ‘qui dure’ ← diré ‘durer’
     c. konblaj ‘comblement’ ← konblé ‘combler’

(17) a. pwofitasyon ‘profit’ ← pwofité ‘profiter’
     b. anmerdasyon ‘emmerdement’ ← anmerdé ‘emmerder’
     c. poursuivasyon ‘poursuite par le diable’ ← poursuiv ‘poursuivre’


Derivation by conversion (18), for its part, does not require the verb's final vowel to disappear.


(18) a. déboulé ‘défilé’ / déboulé ‘défiler rapidement’
     b. lélé ‘touillette’ / lélé ‘touiller’
     c. mayé ‘mariage’ / mayé ‘se marier’
     d. pété ‘pet’ / pété ‘faire un pet’


Since the verb's final vowel disappears only in derivations whose suffix is vowel-initial, everything suggests that a morphophonological constraint is at work (hiatus avoidance, a size constraint, etc.), which invalidates the hypothesis of a derivational deletion rule.
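The hiatus-avoidance reading of (15)-(17) can be sketched as a single concatenation rule. This is an illustrative simplification, assuming that spelling reflects the relevant vowels; the vowel inventory and the function name are assumptions made for the example, not the authors' formalization.

```python
# Vowel letters used in the examples; this inventory is an assumption.
VOWELS = set("aeiouéèòô")


def attach_nominal_suffix(verb: str, suffix: str) -> str:
    """Attach a nominalizing suffix, dropping the verb's final vowel
    when the suffix is vowel-initial (hiatus avoidance)."""
    if verb[-1] in VOWELS and suffix[0] in VOWELS:
        return verb[:-1] + suffix
    return verb + suffix


if __name__ == "__main__":
    for verb, suffix in [("bokanté", "aj"), ("pwofité", "asyon"), ("poursuiv", "asyon")]:
        print(verb, "+", suffix, "->", attach_nominal_suffix(verb, suffix))
```

On this view the verb's final vowel is never deleted by a derivational rule of its own: it is lost only as a side effect of attaching a vowel-initial suffix, which is why a consonant-final verb such as poursuiv surfaces intact.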


3.2.3 Conditions of emergence


These arguments lead us to conclude that French Noun/Verb conversion pairs underwent reanalysis, such that in the creole the morphological relation between the nouns and the verbs in -é of (10) is not one of conversion, as in French, but one of verbal suffixation on a nominal base (N → V). These pairs were inherited in sufficient number to form a system and to allow other denominal verbs suffixed with -é to be formed productively, by analogy, on French or non-French bases, as in (19).


(19) a. bok ‘affront’ / boké ‘faire un affront’
     b. chiktay ‘émiettage’ / chiktayé ‘émietter’
     c. fer ‘fer à cheveux’ / féré ‘défriser’
     d. lyann ‘liane’ / lyanné ‘se servir d’un tuteur pour grimper’
     e. dousin ‘câlin’ / dousiné ‘câliner’
     f. djòb ‘petit boulot’ / djobé ‘faire un petit boulot’
     g. plòk ‘cloque’ / ploké ‘avoir des cloques’


Thus, the reanalysis of these inherited Noun/Verb pairs resulted in the creation of a verbal suffix -é in the creole that does not exist in the lexifier language. This morphological schema is represented in (20), where X stands for the base lexeme (and not for the stem, which may undergo phonological changes under suffixation, as shown in § 3.3):


(20) X_N → Xé_V


The creation of this morphological schema is by no means unprecedented across languages; it resembles what the literature on the mechanisms and motivations of change in word formation calls “secretion” (Rainer 2015: 1771). This concept, taken from Jespersen (1922: 384), refers to a process whereby a purely phonological sequence acquires the status of a “morpheme” (a phenomenon already noted, according to Rainer 2015, by Bloomfield 1891, and by Lass 1990, who speaks of “exaptation”).


By secretion I understand the phenomenon that one portion of an indivisible word 
comes to acquire a grammatical signification which it had not at first, and is then 
felt as something added to the word itself. (Rainer 2015 : 1771) 


It may also be likened to a case of “degrammaticalization” or “deinflectionalization” (Rainer 2015: 1768-69), insofar as the inherited inflectional ending of the French verb (/e/) becomes a derivational suffix.

Be that as it may, the conditions required for the denominal verbal suffix -é to emerge in the creole are specific to it. We posit that they are the following:

1) first, the very strong representation, in the creole lexicon, of Noun/Verb morphological pairs inherited from French, where they stand in a conversion relation;

2) second, within these pairs, a very large majority of verbs ending in -é;


This case is to be distinguished from what Haspelmath (1995: 8-10) calls “secretion”, which refers to the extension of an affix through the incorporation of a non-affixal part of the root (schematized in (a)).

(a) Affix secretion
    xyz → xyz-a
    R = -za
    = new suffix -za, e.g. klm → klm-za
3) and finally, the property of creole verbal lexemes of occurring in a single form only: the inflectional marker of inherited verbs thus could not be interpreted as inflectional in the creole.

It is the conjunction of these three conditions that made the creation of this suffix possible in Guadeloupean Creole. Had any one of them not been met, it is very likely that no new morphological schema would have emerged. For example, all creole verbs inherited from French meet condition 3), but only the final -é of verbs inherited from French was reanalysed as a rule suffixing denominal verbs. This follows from conditions 1) and 2) together: only Noun/Verb pairs with verbs in -é were inherited from French in large numbers, to the exclusion of other verbal endings. All other Noun/Verb pairs occur in tiny numbers, so the second condition above is not met. Indeed, even though Guadeloupean has a number of inherited verbs with an ending other than -é (cf. the table above), these verbs either are not related to any noun (as in (21) for verbs in -i), or are related to one but only through conversion ((22) for verbs in -i), or the related noun is hard to connect morphologically with the verb because the phonological variation between the two is too great (cf. (23) for verbs in -i).


(21) a. abouti ‘aboutir’
     b. aji ‘agir’
     c. dégarni ‘dégarnir’
     d. flétri ‘flétrir’
     e. konstwi ‘construire’

(22) a. amòrti ‘amortir’ / amòrti ‘amorti’
     b. anvi ‘avoir envie’ / anvi ‘envie’
     c. griji ‘s’égratigner’ / griji ‘égratignure’
     d. jwi ‘jouir’ / jwi ‘sperme’
     e. vèrni ‘vernir’ / vèrni ‘verni’

(23) a. chwa ‘choix’ / chwazi ‘choisir’
     b. fen ‘fin’ / fini ‘finir’
     c. flè ‘fleur’ / fléri ‘fleurir’
     d. kous ‘course’ / kouri ‘courir’
     e. trèt ‘traître’ / trayi ‘trahir’


Finally, inherited verbs that do not meet conditions 1) and 2) give rise to no creole creation. To return to the example of verbs in -i, the only ones in our corpus that are not inherited are not derived by means of a verbalizing suffix -i (24):


(24) a. bigidi ‘faiblir’
     b. bénékaki ‘hésiter’
     c. siri ‘devenir aigre’
     d. tini ‘avoir’


The three conditions necessary for the creation of the suffix -é are not specific to Guadeloupean and were met in other French-based creoles. Several creoles followed the same process, and suffixation in -é is among the available morphological schemas of Haitian (DeGraff 2001, Lefebvre 1998, 2003) and St. Lucian (Bhatt & Nikiema 2000, Brousseau 2011). It has nonetheless never been studied in detail in work on these creoles.


3.2.4 Properties of the denominal verbal suffix -é in the creole
3.2.4.1 Phonological form of the suffix


We posit that the phonological form of the denominal verbal suffix thus created is /e/ (spelled -é). In some contexts this vocalic affix is preceded by a consonant, /t/ by default (cf. (25)), and one may ask whether this consonant at the boundary between stem and suffix belongs to the suffix. Everything suggests, however, that the intervening consonant is epenthetic, serving in a lexical context to avoid a sequence of two vowels at the boundary between the base and the affix.


(25) a. konplo ‘complot’ → konploté ‘comploter’
     b. niméwo ‘numéro’ → niméroté ‘numéroter’
     c. soulyé ‘chaussures’ → soulyété ‘mettre des chaussures’


A first argument to this effect is that hiatus avoidance in Guadeloupean Creole is regularly observed at morphological boundaries in derivation: consider, for example, suffixed derivatives whose vowel-initial suffix triggers deletion of the final vowel of a verb in -é. A second argument is the development of other hiatus-avoidance strategies in morphological contexts, such as recourse to derivational rules that circumvent the problem, namely conversion or prefixation. One can thus state that suffixation in -é triggers phonological changes on nominal bases, of which epenthesis is only one example (see Villoing & Deglas 2016a for more details).

The presence of any other consonant between the stem and the suffix falls under cases other than consonant epenthesis or suffix allomorphy. Thus,

(i) a specific realization of nasal vowels under derivation in Guadeloupean, as in other French-based creoles (cf. Bhatt & Nikiema 2000), causes a nasal consonant to surface after the stem's nasal vowel under suffixation in -é (cf. (26));

(ii) lexical consonants inherited from the French lexemes surface only in this derivational context (the suffix protecting the consonant), since they have otherwise been lost word-finally (cf. (27)):


(26) a. boukan ‘feu de brindille’ → boukanné ‘griller au feu de bois’
     b. dirèksyon ‘direction’ → diréksyonné ‘montrer la direction’
     c. gidon ‘guidon’ → gidonné ‘mener’
     d. losyon ‘lotion’ → losyonné ‘se parfumer’
     e. migan ‘purée’ → miganné ‘mélanger’

(27) a. arbit ‘arbitre’ → arbitré ‘arbitrer, trancher’
     b. chalè ‘chaleur’ → chaléré ‘s'inquiéter’
     c. janm ‘jambe’ → janbé ‘enjamber’
     d. te ‘terre’ → téré ‘enterrer’
     e. penti ‘peinture’ → pentiré ‘peindre’
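The default behaviour described in this section (plain -é after a consonant, epenthetic /t/ after an oral vowel, a nasal consonant after a nasal vowel, and lexically stored consonants as in (27)) can be put together in a rough sketch. Everything here is an assumption made for illustration: the rule operates on spelling, the lists of oral vowels and nasal-vowel spellings are simplified, and the `latent_consonant` argument is a hypothetical way of encoding the inherited consonants of (27). Stem-internal changes (e.g. chalè → chaléré, janm → janbé, niméwo → niméroté) are deliberately not modelled.

```python
# Simplified orthographic classes; both inventories are assumptions.
ORAL_VOWELS = set("aeiouéèòô")
NASAL_VOWEL_SPELLINGS = ("an", "on", "en")


def suffix_e(base, latent_consonant=None):
    """Form a denominal verb in -é from a base noun (orthographic sketch)."""
    if latent_consonant is not None:
        # Lexically stored consonant that only surfaces before the suffix (27).
        return base + latent_consonant + "é"
    if base.endswith(NASAL_VOWEL_SPELLINGS):
        # Final nasal vowel: a nasal consonant surfaces before -é (26).
        return base + "né"
    if base[-1] in ORAL_VOWELS:
        # Final oral vowel: default epenthetic /t/ avoids hiatus (25).
        return base + "té"
    # Consonant-final base: the suffix attaches directly (10), (19).
    return base + "é"


if __name__ == "__main__":
    for noun in ["bav", "konplo", "soulyé", "boukan", "kabann"]:
        print(noun, "->", suffix_e(noun))
    print("arbit", "->", suffix_e("arbit", latent_consonant="r"))
```

A base such as kabann, spelled with -nn, counts as consonant-final here and simply takes -é, whereas boukan, spelled with a single -n, is treated as ending in a nasal vowel. Not every form in the data follows these defaults, so the sketch should be read as a summary of the regular cases only.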


3.2.4.2 Semantic properties of the rule


The semantic relation between the base noun (henceforth Nbase) and the denominal verb suffixed with -é is partly typical of this kind of morphological construction in French and partly original.

It is typical where the Nbase refers to participants of the verb, such as the instrument in (28) (covering both artefacts (28a) and body parts (28b)), an agent in (29), a displaced entity (locatum verbs, figure verbs) in (30a), the location of the process (location verbs, ground verbs) in (30b), and the object resulting from the process in (31).


(28) N: instrument
     a. i.   fak ‘bêche’ → faké ‘bêcher’
        ii.  kon ‘klaxon’ → koné ‘klaxonner’
        iii. graj ‘râpe’ → grajé ‘râper’
        iv.  pikwa ‘pioche’ → pikwaté ‘piocher’
     b. i.   lang ‘langue’ → langé ‘embrasser avec la langue’
        ii.  bwa ‘bras’ → bwaré ‘enlacer’
        iii. zig ‘position des doigts pour faire une pichenette’ → zigé ‘faire une pichenette’
        iv.  zyé ‘yeux’ → zyété ‘surveiller’

(29) N: agent
     a. mako ‘mouchard’ → makoté ‘moucharder’
     b. makrél ‘celle qui se mêle de tout’ → makrélé ‘surveiller’
     c. mandyan ‘mendiant’ → mandyanné ‘mendier’




Florence Villoing & Maxime Deglas 


(30) a. N: displaced entity
        i.   bonda ‘fesses’ → bondaté ‘poser ses fesses’
        ii.  janb ‘jambe’ → janbé ‘enjamber’
        iii. pyé ‘pied’ → pyété ‘poser le pied’
        iv.  soulyé ‘chaussures’ → soulyété ‘mettre des chaussures’
     b. N: final location of the process
        i.   balkon ‘balcon’ → balkonné ‘être au balcon’
        ii.  kabann ‘lit’ → kabanné ‘traîner au lit’
        iii. kan ‘côté’ → kanté ‘se mettre sur le côté, sur le flanc’

(31) N: resulting object
     a. flang ‘entaille’ → flangé ‘entailler’
     b. migan ‘purée’ → miganné ‘mélanger’
     c. fifin ‘bruine’ → fifiné ‘bruiner’
     d. tij ‘bourgeon’ → tijé ‘bourgeonner’


The semantic relation between the Nbase and the derived verb in -é is nonetheless atypical in the examples in (32), where the Nbase denotes a dynamic situation (see Villoing & Deglas 2016a for the eventhood tests):


(32) a. bonbans ‘fête’ → bonbansé ‘faire la fête’
     b. chikann ‘contestation’ → chikanné ‘contester’
     c. chiktay ‘émiettage’ → chiktayé ‘émietter’
     d. dousin ‘caresse’ → dousiné ‘caresser’
     e. driv ‘promenade’ → drivé ‘promener’
     f. kalbann ‘culbute’ → kalbanné ‘culbuter’


Indeed, in French, “event nouns” are prototypically deverbal, and cases of event nouns serving as bases for a derived verb remain a minority. For example, Corbin (2004) notes a few French suffixed verbs built on simple nouns denoting processes (guerroyer and satiriser, built on the processual nouns guerre and satire). But such examples are bound to be very few:

- first, because simple nouns denoting an event remain rare in the French lexicon (le concert, l'orage; they represent 8.1% of simple nouns according to Tribout et al. 2014) and generally go back to deverbal nouns in Latin;

- second, because although the bases of verbs in -iser can be processual, this is rarely the case, in English (Plag 1999) as in French (Namer 2013);

- and finally, because suffixation in -oyer appears to be barely productive.

This rarity confirms Croft's (1991) hypothesis that nouns prototypically denote objects.

The situation seems to be different when the processual nominal bases are themselves morphologically complex. Some recent work on French has noted that certain constructed nouns denoting events are relatively available as bases for verb formation. Tribout (2010), for example, shows that a non-negligible number of denominal verbs formed by conversion are built on deverbal event nouns (33):


(33) a. louer → louange → louanger
     b. vider → vidange → vidanger
     c. recevoir → réception → réceptionner
     d. frotter → friction → frictionner
     e. partir → partage → partager


Tribout (2010) explains this by the fact that the base noun has lost its morphological motivation and is no longer perceived as built on a verbal base (for example, (33c), (33d), (33e)). For other pairs, however, the relation between the abstract noun and its base verb remains fully transparent (for example, (33a), (33b)).

Lignon & Namer (2014) reach the same result for other cases of conversion in French, in which abstract nouns suffixed with -ion serve as bases for converted verbs even though these nouns are built on easily recoverable verbal bases (34):
(34) a. attirer → attraction → attractionner
     b. intercéder → intercession → intercesser
     c. soumettre → soumission → soumissionner
     d. voir → vision → visionner


In parallel, another process builds verbs on eventive nominal bases: back-formation from neoclassical compounds (Namer 2012) (cf. (35)).


(35) a. photoémission → photoémettre
     b. hydromassage → hydromasser
     c. hydroextraction → hydroextraire


Thus, forming a verb on an event-noun base in French (i) is available only for morphologically constructed nominal bases, and (ii) preferentially involves conversion. This specific configuration is not found in the Guadeloupean Creole data examined above, which show a suffixation rule applying to morphologically simple eventive nominal bases. The creole thus displays a noteworthy semantic originality relative to French. We attribute it to the very specific origin of the -é suffixation rule, which arose from the reanalysis of French Noun/Verb pairs belonging to two conversion rules: V → N conversion and N → V conversion.


3.3 Reanalysis of N/prefixed-V pairs as parasynthetic formations


The absence of verbal inflection in Guadeloupean Creole and the inheritance of a single form of the French verb (here, for the verbs of interest, the infinitive or past participle form in /e/) trigger further morphological reanalyses. Thus, the inherited pairs in (36), whose verb is formed by prefixation in French, can only be analysed in the creole in terms of parasynthesis.


(36) a. bò ‘bord’ / débòdé ‘déborder’
     b. frich ‘friche’ / défriché ‘défricher’
     c. kras ‘crasse’ / dékrasé ‘décrasser’
     d. mayo ‘étoffe d'emmaillotage de bébé’ / démayoté ‘démailloter’
     e. rasin ‘racine’ / dérasiné ‘déraciner’


The following sections argue for this hypothesis and present the phonological and semantic properties associated with this morphological schema, which is specific to the creole.




6 Comment le créole réanalyse les dérivations du français 


3.3.1 dé-N-é parasynthetic verbs


The Noun/Verb morphological pairs in (36), inherited from French, do not receive the same morphological analysis in Guadeloupean Creole and point to a further case of morphological reanalysis. Where the analysis of French recognizes a verb derived by prefixation of dé- on a nominal base, the creole forms a verb by parasynthesis on a nominal base.

The reasoning behind this result is close to the one that led us to identify the creation of the denominal verbalizing suffix -é: since creole verbs occur in a single form only, final -é belongs to the lexical form of the verb and does not correspond to the infinitive affix found in the citation form of the verb. Between the base noun and the derived verb, then, additional phonological material appears at both edges: a prefix dé- to the left of the base and the verbalizing suffix -é to its right. Yet these affixes do not result from the successive application of two morphological rules, since neither the verb in -é (37) nor the noun in dé- (38) exists independently.


(37) a. *bòdé_V → débòdé_V ‘déborder’
     b. *friché_V → défriché_V ‘défricher’
     c. *krasé_V → dékrasé_V ‘décrasser’
     d. *mayoté_V → démayoté_V ‘démailloter’
     e. *rasiné_V → dérasiné_V ‘déraciner’

(38) a. *débòd_N → débòdé_V ‘déborder’
     b. *défrich_N → défriché_V ‘défricher’
     c. *dékras_N → dékrasé_V ‘décrasser’
     d. *démayo_N → démayoté_V ‘démailloter’
     e. *dérasin_N → dérasiné_V ‘déraciner’


Thus, the examples in (36) can be analysed neither as dé- prefixations on a verbal base (the verb does not exist) nor as verbs suffixed with -é on a nominal base (these bases do not exist either). These properties recall the criteria traditionally put forward to identify parasynthesis (cf. Darmesteter 1894: 24, presented above in § 2.1; Corbin 1987: 121-125; Fradin 2003: 288-306). Since the only possible morphological relation is the one between the base Noun and the derived Verb, and since it is expressed by simultaneous prefixation and suffixation (prefixation of dé- and suffixation of -é), we are entitled to hypothesize that French Noun/prefixed-Verb pairs were reanalysed in Guadeloupean as creole parasynthetic formations.

De même que les paires Nom/Verbe à finale en -é présentées en section 3.2, les paires 
Nom/Verbe à initiale en dé- héritées l'ont été en grand nombre et le schéma morpholo- 
gique créé à l'issue de cette réanalyse est devenu productif en créole, comme l'attestent 
les créations de (39) : 


(39) a. chépi → déchépiyé
        ‘charpie’ ‘mettre en charpie’
     b. chouk → déchouké
        ‘souche’ ‘déraciner’
     c. pat → dépaté
        ‘main de banane’ ‘retirer les mains du régime de banane’
     d. tik → détiké
        ‘tique’ ‘retirer les tiques’
     e. zo → dézosé
        ‘os’ ‘désosser’


Like the reanalysed inherited pairs in (36), the creole coinages in (39) are analysed as
parasynthetic verb formations insofar as neither the verb in -é (40) nor the noun in dé-
(41) exists independently of the other:


(40) a. *chépiyé → déchépiyé
        ‘mettre en charpie’
     b. *chouké → déchouké
        ‘déraciner’
     c. *paté → dépaté
        ‘retirer la main de bananes du régime’
     d. *tiké → détiké
        ‘retirer les tiques’
     e. *zosé → dézosé
        ‘désosser’

(41) a. *déchépi → déchèpiyé
        ‘mettre en charpie’
     b. *déchouk → déchouké
        ‘déraciner’
     c. *dépat → dépaté
        ‘retirer la main de bananes du régime’
     d. *détik → détiké
        ‘retirer les tiques’
     e. *dézo → dézosé
        ‘désosser’


Thus, the conditions required for the emergence of the morphological schema (42) in
Guadeloupean Creole, which we set out in § 3.2.3, are met here too:


1) the very strong representation, in the creole lexicon, of Noun/Verb morphological
   pairs with initial dé- inherited from French denominal verbal prefixations;


2) the fact that almost all the verbs in these pairs end in -é;


3) and finally, the property of creole verbal lexemes of having only one form, the
   inflectional marking of the inherited verbs not having been interpreted as such in
   the creole.


We can therefore posit that Guadeloupean Creole has a parasynthetic morphological schema
of the type in (42), where X stands for the base lexeme, of nominal type, and dé-...-é
for the parasynthetic affix (circumfix) forming verbs. This schema accounts equally well
for the Noun/Verb pairs inherited from French in (36) and for those built in the creole
in (39):


(42) dé-X_N-é_V


3.3.1.1 Phonological form of the affix


The phonological form of the parasynthetic affix is /de-X-e/ (which we spell dé-X-é),
where X stands for the nominal base and dé- ... -é for the affix. Any consonants that
appear on the right, between the base stem and the suffix -é, are to be analysed as
epenthetic consonants in a vocalic left lexical context, as we observed for suffixation
in -é (cf. § 3.2.3), both for the inherited pairs (cf. (43a)) and for the creole pairs,
for which we find only one example (43b):


(43) a. i.   bò → débòdé
             ‘bord’ ‘déborder’
        ii.  figi → défigiré
             ‘visage’ ‘défigurer’
        iii. ma → dématé
             ‘mât de bateau’ ‘démâter (bateau), renverser, retourner’
        iv.  mayo → démayoté
             ‘maillot, étoffe pour emmailloter un nouveau-né’ ‘démailloter’
        v.   zo → dézosé
             ‘os’ ‘désosser’
     b. chépi → déchèpiyé
        ‘charpie’ ‘mettre en charpie’
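
The circumfixal schema dé-X-é and the epenthetic consonants seen in (43) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `parasynthetic_verb`, the orthographic vowel list, and the decision to pass the epenthetic consonant in as a lexically listed parameter are illustrative choices and not part of the analysis (the text does not claim that the consonant is predictable). Pairs whose base vowel also changes in the derivative (e.g. chépi → déchèpiyé) are deliberately left out.

```python
# Orthographic vowels, used as a rough stand-in for "vowel-final base".
# Assumption for illustration only: nasal vowels spelled with <n> are not handled.
VOWEL_LETTERS = set("aeiouéèòà")


def ends_in_vowel(base: str) -> bool:
    """Return True if the written form of the base ends in a vowel letter."""
    return base[-1] in VOWEL_LETTERS


def parasynthetic_verb(base: str, epenthetic: str = "") -> str:
    """Apply the circumfix dé-...-é to a nominal base.

    `epenthetic` is the consonant (if any) that appears between a
    vowel-final base and the suffix -é; it is supplied by the caller
    rather than predicted (hypothetical design choice).
    """
    if epenthetic and not ends_in_vowel(base):
        raise ValueError("epenthesis is only expected after a vowel-final base")
    return f"dé{base}{epenthetic}é"


# (base, epenthetic consonant, attested verb), taken from (39) and (43).
PAIRS = [
    ("chouk", "", "déchouké"),
    ("pat", "", "dépaté"),
    ("tik", "", "détiké"),
    ("bò", "d", "débòdé"),
    ("figi", "r", "défigiré"),
    ("ma", "t", "dématé"),
    ("mayo", "t", "démayoté"),
    ("zo", "s", "dézosé"),
]

for base, consonant, attested in PAIRS:
    built = parasynthetic_verb(base, consonant)
    print(f"{base} -> {built} (matches attested form: {built == attested})")
```

Listing the consonant per base keeps the sketch faithful to the data: the same vocalic context hosts d, r, t, and s in (43), so nothing in the examples licenses a single default consonant.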


The typical allomorphy shown by the prefix dé- in French, which the creole prefix dé-
has inherited (dé- before a consonant-initial verb and déz- before a vowel-initial verb;
cf. (44a) for the pairs inherited from French and (44b) for examples of creole coinage),
is not found in our corpus of dé-X-é parasynthetics.


(44) a. i.   dézabityé ← abityé
             ‘déshabituer’ ‘s'habituer’
        ii.  dézakordé ← akordé
             ‘désaccorder’ ‘accorder’
        iii. dézanbalé ← anbalé
             ‘déballer’ ‘emballer’
        iv.  dézanbwaté ← anbwaté
             ‘désemboîter’ ‘emboîter’
        v.   dézankonbré ← ankonbré
             ‘défaire ce qui était encombré’ ‘occuper à l'excès un lieu’
     b. i.   dézantòtiyé ← antòtiyé
             ‘détortiller’ ‘entortiller’
        ii.  dézanbaglé ← anbaglé
             ‘débarrasser’ ‘encombrer une table, un meuble’
        iii. dézanrajé ← anrajé
             ‘ne plus être fâché, enragé’ ‘avoir la rage’
        iv.  dézapiyé ← apiyé
             ‘interrompre l'action de s'appuyer’ ‘appuyer’


Indeed, we find no parasynthetic verb built on a vowel-initial base. The only data that
might have seemed relevant are the inherited dézosé ‘désosser’ and dézérbé ‘désherber’,
but in the creole they are analysable on the consonant-initial nominal bases zo ‘os’ and
zèb ‘herbe’.
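
The dé-/déz- alternation that (44) illustrates for prefixation on verbal bases can be stated as a one-line rule. The Python sketch below is a hedged illustration only: the function name `prefix_de` and the orthographic vowel list are assumptions introduced here, and the rule is checked solely against forms cited in this chapter.

```python
# Orthographic vowels, used as a rough proxy for a vowel-initial verb
# (assumption for illustration only).
VOWEL_LETTERS = set("aeiouéèòà")


def prefix_de(verb: str) -> str:
    """Prefix a verbal base with déz- before a vowel and dé- before a consonant."""
    allomorph = "déz" if verb[0] in VOWEL_LETTERS else "dé"
    return allomorph + verb


# Vowel-initial bases from (44) and consonant-initial bases from (53) and (56).
for verb in ["abityé", "akordé", "anbalé", "apiyé", "garé", "faché", "viré"]:
    print(f"{verb} -> {prefix_de(verb)}")
```

Applied to a base such as zo, the same rule would yield dé-, which fits the observation that dézosé is analysable on a consonant-initial base rather than as an instance of the déz- allomorph.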


3.3.1.2 Semantic properties: privative meaning


The most salient meaning associated with this parasynthetic formation is what the
literature on creoles commonly calls the “privative meaning”, regularly recognized for
identical formations in other creoles (cf. Chaudenson 1996: 27; Filipovich 1987: 44;
DeGraff 2001: 78-80; Lefebvre 2003: 6-8; Brousseau 2011: 70-71). This semantic value can
be regarded as inherited from French, where it has already been identified as
characteristic of the verbalizing prefix dé- on a nominal base (cf. Corbin 1987: 62-63
and 252, for example). More precisely, this privative meaning is part of a spatial
relation between the base noun and the derived verb, a relation that French-speaking
authors represent by means of the cible/site (‘target/site’) terminology of Vandeloise
1986 (corresponding to the figure/ground or trajector/landmark oppositions of cognitive
semantics; cf. Fradin 2003: 298, Amiot 2008: 10, Jalenques 2014: 1783). The nominal base
of French dé- prefixation may denote either the target or the site of the relation.


(i) When the base denotes the site of the relation, the verb denotes the action of
    “getting out of what the base denotes” (Jalenques 2014: 1782) (which Corbin 1987
    paraphrases as “remove from X”): déterrer, dérailler, débarquer, etc.


(ii) When the base denotes the target of the relation, the verb denotes the action of
    “removing what the base denotes” (Jalenques 2014: 1782) (which Corbin 1987
    paraphrases as “remove X”): désosser, déneiger, dépoussiérer, déminer, etc.


In reanalysing the Noun/Verb-prefixed-in-dé- pairs inherited from French, Guadeloupean
Creole preferentially builds dé-N-é parasynthetics in which the base noun (henceforth
Nbase) denotes the target of the relation (45):


(45) a. chouk → déchouké
        ‘souche’ ‘déraciner’
     b. jouk → déjouké
        ‘joug’ ‘enlever le joug’
     c. pat → dépaté
        ‘main de banane’ ‘retirer les mains du régime de banane’
     d. pyèt → dépyété
        ‘pattes’ ‘retirer les pattes (crabe)’
     e. tik → détiké
        ‘tique’ ‘retirer les tiques’


By comparison, creole dé-N-é parasynthetics whose N denotes the site of the relation are
very weakly represented in our corpus, which contains only the examples in (46):


(46) a. bous → débousé
        ‘bourse’ ‘dépenser’
     b. tab → détablé
        ‘table’ ‘enlever les couverts d'une table’


This tendency is largely confirmed by the N / N-é_V / dé-N-é_V triplets (inherited or
creole) whose construction schema is not immediately transparent (V → dé-V_V, or
N → dé-N-é_V?)⁷ but whose dé-N-é_V forms are compatible with a privative interpretation
in which the noun (N) would be the target of the relation (47): here again, they are far
more numerous than those whose noun would be the site of the relation (cf. the only
examples, in (48)):
(47) a. bach / baché / débaché
        ‘bâche’ ‘bâcher’ ‘débâcher’
     b. grés / gresé / dégrésé
        ‘graisse’ ‘graisser’ ‘dégraisser, enlever la graisse’
     c. kabòs / kabosé / dékabosé
        ‘bosse’ ‘déformer’ ‘débosseler’
     d. nat / naté / dénaté
        ‘natte’ ‘natter des cheveux’ ‘enlever les nattes’
     e. sel / salé / désalé
        ‘sel’ ‘saler’ ‘dessaler’

(48) a. kof / kofré / dékofré
        ‘coffre’ ‘coffrer’ ‘décoffrer’
     b. kouch / kouché / dékouché
        ‘lit’ ‘se coucher’ ‘découcher’
     c. plas / plasé / déplasé
        ‘place’ ‘placer’ ‘déplacer’
     d. té / téré / détéré
        ‘terre’ ‘enterrer’ ‘déterrer’
     e. kwen / kwensé / dékwensé
        ‘coin’ ‘coincer’ ‘décoincer’


The reason for this clear preference certainly lies in the fact that the pairs inherited
from French also mostly display this semantic relation between the noun and the verb
(49), as shown by the very weak representation (only 3 pairs) in our corpus of dé-N-é_V
parasynthetic pairs whose N denotes the site of the relation (50):


(49) a. fey → déféyé
        ‘feuilles’ ‘ôter les feuilles’
     b. fòwm → défòwmé
        ‘forme’ ‘déformer’
     c. kouraj → dékourajé
        ‘courage’ ‘décourager’
     d. kras → dékrasé
        ‘crasse’ ‘décrasser’
     e. mayo → démayoté
        ‘maillot, étoffe pour emmailloter le nouveau-né’ ‘démailloter’

(50) a. bò → débòdé
        ‘bord’ ‘déborder’
     b. bous → débousé
        ‘bourse’ ‘dépenser’
     c. moul → démoulé
        ‘moule du gâteau’ ‘sortir du moule un gâteau’


⁷ Indeed, in the case of triplets, the difficulty is that one cannot always determine
whether the derivative was built on the verb by prefixation or on the N by parasynthesis;
as noted by Corbin (1987: 63) and Amiot (2008: 12), there are “cases of categorial
ambiguity” whose semantic interpretation is compatible with both constructions (for
example: débwasé ‘inverse de boiser’ or ‘enlever le bois’).


3.3.1.3 Semantic properties: other, minority meanings


In parallel, other meanings emerge in the creole, but in very small proportion, again
reflecting their weak representation in the pairs and triplets inherited from French:


(i) the Nbase represents the object resulting from the process

(51) a. chépi → déchèpiyé
        ‘charpie’ ‘mettre en charpie’
     b. gout → dégouté
        ‘goutte’ ‘couler goutte à goutte’
     c. kal → dékalé
        ‘raclée’ ‘tabasser’
     d. katyé → dékatyé
        ‘morceau’ ‘couper en quartier’


(ii) the Nbase represents the displaced object when the verb refers to a location ((52a)
    for the creole pairs, (52b) for the pairs inherited from French)

(52) a. kaz → dékazé
        ‘maison’ ‘déplacer une case à l'aide d'un véhicule pour l'installer ailleurs’
     b. ménaj → déménajé
        ‘ensemble des meubles, des objets nécessaires à la vie domestique’ ‘déménager’


3.3.2 Prefixed dé-V


These parasynthetic formations must be distinguished from dé- prefixations on a verbal
base, which either (i) refer to the reverse of the process denoted by the base (53), or
(ii) trigger no semantic change relative to the verbal base (54).

(53) a. ankayé → dézankayé
        ‘se prendre dans les récifs (pour un hameçon)’ ‘enlever des récifs coralliens’
     b. baké → débaké
        ‘embarquer’ ‘débarquer’
     c. faché → défaché
        ‘être fâché’ ‘ne plus être fâché’
     d. manché → démanché
        ‘mettre un manche’ ‘ôter le manche’
     e. rèspèkté → dérèspèkté
        ‘respecter’ ‘manquer de respect’

(54) a. chalviré → déchalviré
        ‘chavirer’
     b. chiktayé → déchiktayé
        ‘émietter, mettre en charpie’
     c. libéré → délibéré
        ‘libérer (qqun de prison)’
     d. rifizé → dérifizé
        ‘refuser’
     e. viré → déviré
        ‘tourner en sens inverse’


Although they display prima facie identical initial and final phonological segments (the
prefix dé- and the verbal ending -é), prefixations on a verbal base differ from
parasynthetics in that they are not derived from any noun. This difference in
construction goes hand in hand with a difference in the semantic relation between the
base and the derivative.


3.3.2.1 dé-V prefixation with reversative meaning


In most cases, dé-V prefixation builds a meaning that is not privative but reversative,
as recognized in work on Haitian and St. Lucian creoles. The reversative meaning is
approached in different ways by authors who have worked on French. If we keep to the most
recent work, for example Jalenques (2014: 1778), who follows the description proposed by
Gerhard-Krait (2000), verbs prefixed with dé- and built on a verbal base have three
readings:


a) reversal of the result of the process expressed by the verbal base (in relation to its
   possible complements): dénouer sa cravate = acting so as to cancel the result of
   “nouer la cravate”;


b) the reverse of the (non-resultative) process expressed by the base: décroître = the
   reverse of croître;


c) the negation of the (non-resultative) process expressed by the base: déplaire = not
   plaire.


The Verb / dé-V_V pairs inherited from French by the creole are very largely of type a)
or b) (55).


(55) a. débatizé / batizé
        ‘débaptiser’ ‘baptiser’
     b. déchosé / chosé
        ‘déchausser’ ‘chausser’
     c. dégonflé / gonflé
        ‘dégonfler’ ‘gonfler’
     d. démayé / mayé
        ‘démarier’ ‘marier’
     e. dézabiyé / abiyé
        ‘déshabiller’ ‘s'habiller’
     f. dézankastré / ankastré
        ‘défaire ce qui était encastré’ ‘encastrer’
     g. dézantòtiyé / antòtiyé
        ‘détortiller’ ‘entortiller’

The creole pairs are also largely of type a):

(56) a. dégaré / garé
        ‘sortir de la place de garage, de stationnement’ ‘garer, stationner’
     b. dékouvè / kouvè
        ‘découvrir’ ‘couvrir’
     c. dékòviyé / kòviyé
        ‘détordre, remettre en position initiale’ ‘tordre’
     d. dépayé / payé
        ‘annuler un pari’ ‘parier’
     e. dézanbaglé / anbaglé
        ‘débarrasser’ ‘encombrer une table, un meuble’
The corpus contains only one example of type c), cf. (57):


(57) dérèspèkté / rèspèkté
     ‘manquer de respect, y compris sexuellement’ ‘respecter’

The data thus lead us to consider that the creole, having inherited the most available
V / dé-V_V pairs of French (those with reversative value), formed the creole derivatives
on these pairs by analogy. The reversative meaning is therefore probably inherited from
French dé- prefixation. Nevertheless, this reversative value remains confined to
prefixations on a verbal base and is not represented in any example of dé-N-é_V
parasynthetics. The two morphological schemas thus seem to have specialized semantically
in the creole:


- the privative meaning is reserved for dé-N-é_V parasynthesis (even if other semantic
  values are possible);


- the reversative meaning is specific to dé-V prefixation.


This semantic specialization could make it possible to settle the analysis of the
N / V / dé-N-é_V triplets, which appear in far greater numbers in our corpus than
dé-N-é_V parasynthetics and dé-V_V prefixations, both for those inherited from French
(58) and for those built in the creole (59).


(58) a. apui / apiyé / dézapiyé
        ‘appui’ ‘s'appuyer’ ‘ne pas s'appuyer’
     b. bwa / bwazé / débwazé
        ‘bois’ ‘boiser’ ‘déboiser’
     c. klou / klouwé / déklouwé
        ‘clou’ ‘clouer’ ‘enlever les clous’
     d. pengn / pengné / dépengné
        ‘peigne’ ‘peigner’ ‘dépeigner’
     e. tach / taché / détaché
        ‘tache’ ‘tacher’ ‘détacher’

(59) a. bonda / bondaté / débondaté
        ‘fesses’ ‘s'asseoir’ ‘se lever’
     b. bwa / bwaré / débwaré
        ‘bras’ ‘enlacer’ ‘désenlacer’
     c. grij / griji / dégriji
        ‘fronce’ ‘faire des fronces’ ‘retirer les fronces’
     d. lyann / lyanné / délyanné
        ‘union’ ‘s'unir’ ‘se désunir’
     e. janm / janbé / déjanbé
        ‘jambe’ ‘enjamber’ ‘procès inverse d'enjamber’


3.3.2.2 dé-V prefixation without semantic change


dé-N-é_V parasynthetic formations must also be distinguished from dé- prefixations on a
verbal base (dé-V_V) which, unlike the preceding ones, involve no semantic change (cf.
in (60) the V / dé-V_V pairs inherited from French and in (61) those built in the
creole):


(60) a. partajé → départajé
        ‘partager’
     b. plimé → déplimé
        ‘plumer’
     c. tranpé → détranpé
        ‘tremper’
     d. vidé → dévidé
        ‘vider’
     e. pozé → dépozé
        ‘déposer, remettre à sa place’

(61) a. bwété → débwété
        ‘boiter, marcher en boitant’
     b. chiktayé → déchiktayé
        ‘émietter, mettre en charpie’
     c. rifizé → dérifizé
        ‘refuser’
     d. sòti → désòti
        ‘sortir’
     e. viré → déviré
        ‘tourner en sens inverse’


This absence of semantic variation associated with prefixation is in no way particular
to the creole, since it is observed in French (Muller 1990, Gerhard-Krait 2000,
Apothéloz 2007, Jalenques 2014) (62) and in other French-based creoles such as Haitian
(Filipovich 1987, Lefebvre 2003, Valdman 1981) (63) or St. Lucian (Brousseau 2011: 74).


(62) a. couper → découper
     b. doubler → dédoubler
     c. marquer → démarquer
     d. passer → dépasser
     e. verser → déverser

(63) a. chiré → déchiré
        ‘déchirer’
     b. chifonnen → déchifonnen
        ‘froisser’
     c. gengole → dégengole
        ‘se précipiter’
     d. grennen → dégrennen
        ‘égrener’


An analysis often mentioned, for French as for the creole, is the possibility of an
intensive value of the dé- prefixed form relative to the base verb. Although this value
is justified in particular cases, it cannot hold for all cases (see the critique by
Jalenques (2014: 1779) for French and by DeGraff (2001) for the creole). In any case,
this property does not affect dé-N-é_V parasynthetics.


4 Conclusion 


The development in Guadeloupean Creole of two morphological schemas for forming verbs by
affixation (denominal verbal suffixation in -é (N-é_V) and denominal verbal parasynthesis
dé-N-é_V) results from the reanalysis of Noun / Verb pairs inherited from French. The
conditions necessary for these reanalyses are crucially rooted in the property of
Guadeloupean lexemes of being realized in a single form only. Indeed, most verbs
inherited from French have a final -é that probably comes from the inflected forms of
the infinitive or past participle of the original French verb. It is this final -é which,
in the context of the Noun/Verb pairs in which it appears, is reanalysed as a
derivational suffix, thereby giving rise to two new morphological schemas in the creole
that do not exist in French. In sum, applying the notion of lexeme to the analysis of the
creole data makes it possible to recognize the validity of these morphological schemas in
Guadeloupean, whereas it had led to questioning the relevance of the same schemas for the
corresponding French data.

These two examples of reanalysis lead us to reject the position that derivation emerges
only through gradual grammaticalization (cf. for example McWhorter 1998). The
Guadeloupean Creole data we have examined lead us instead to follow the proposal of
Rainer (2015), according to whom grammaticalization is only one mechanism of
morphological change among others, reanalysis being another.

The mechanism of reanalysis, although not specific to creole languages, nevertheless
plays an important part in them because of the massive share of their lexicon that is
inherited from French. This is shown by other morphological schemas such as suffixation
in -asyon in Guadeloupean (anmerdasyon ‘tracas’ ← anmerdé ‘emmerder’; pwofitasyon
‘action d'abuser de la faiblesse de qqun’ ← pwofité ‘profiter de la faiblesse de
l'autre’), in which the phonological form of the suffix results from the amalgamation of
the final segment of the base verb's stem and the suffix -ion of the verbs inherited
from French (admirasyon ‘admiration’ / admiré ‘admirer’; ògmantasyon ‘augmentation’ /
ògmanté ‘augmenter’) (Villoing & Deglas 2016b).

Chapter 7


Some remarks on clipping of deverbal 
nouns in French and Italian 


Pavel Stichauer 
Charles University, Prague 


This chapter deals with the restricted class of clipped deverbal nominals in French (e.g.
introduction → intro) and especially in Italian (e.g. giustificazione → giustifica) and
aims to show that subtle semantic restrictions seem to constrain such clipping, although
there are some differences between the two languages. First, I introduce the well-known
distinction between event (E) and result/referential (R) nouns that has been further
elaborated by Melloni (2006, 2007, 2011). I then proceed to discuss a class of formations
where clipping seems to be sensitive to a special result/object meaning which is very
close to what Pustejovsky (1991: 174; see Melloni 2011: 109, 111, 142) calls information
object. On the basis of a limited class of examples (both attested and hypothetical, e.g.
quantificazione → quantifica), I argue that where there is such an information object
reading available to the relevant nominal, the clipping rule may apply. I take these
phenomena to be relevant for Fradin & Kerleroux’s (2009: 84-86) Maximal Specification
Hypothesis, according to which word-formation rules can apply, especially in the case of
polysemous lexemes, to specific semantic features inherent in the overall meaning of the
base. I demonstrate that clipping can have access to precisely these semantic features.


1 Introduction 


It is widely held that morphological phenomena such as clipping (or truncation and 
blending) can be well explained within a sociolinguistic or pragmatic framework where 
specific stylistic, diaphasic and/or diastratic factors are at work. Under this view, the only 
morphologically relevant issue would be that of phonological conditions and constraints 
on the bases. Nevertheless, there have recently been some attempts to show that there 
might also be specific semantic constraints that, in some cases, rule out the possibility 
of such morphological reduction, regardless of any pragmatically constrained context. 
Such studies demonstrate that truncation may operate in a highly systematic way that
involves access to specific semantic information of a given base.

In Olivier Bonami, Gilles Boyé, Georgette Dal, Hélène Giraudo & Fiammetta Namer

In this chapter, I intend to show that, within the restricted class of clipped deverbal
nominals in French (e.g. introduction → intro) and especially in Italian (e.g.
giustificazione → giustifica), which will be the focus of the present text, special and
subtle semantic restrictions seem to constrain the availability of these formations,
though the two languages do not cover exactly the same group of formations.

In what follows, I will assume the traditional, though much debated, distinction between
inflection and derivation (see, e.g., Spencer 2013: 38-43). Such a distinction is
fundamental in that it posits two different roles of morphology: inflectional morphology
is supposed to realize the inflected forms of a given lexeme, while derivational
morphology serves to create new lexemes.¹ However, the difficulty of the topic to be
tackled in the following pages lies precisely in the fact that clipping (or truncation)
does not always seem to deliver an entirely new lexeme.

I shall argue, following Fradin & Kerleroux’s (2009: 84-86) Maximal Specification Hy- 
pothesis, that word-formation rules apply, especially in the case of polysemous lexemes, 
to specific semantic features inherent in the overall meaning of the base, and that clip- 
ping can have access to precisely these semantic features. 

The text is organized as follows. In Section 2, I first lay out the well-known distinction 
between event (E) and result/referential (R) nouns that has been further elaborated by 
Melloni (2006, 2007, 2011) and that, at first sight, seems to capture some of the known 
cases. In Section 3, I briefly comment on the French data taken from Kerleroux (1997), Fra- 
din & Kerleroux (2003), and Fradin (2003). In Section 4, I take up the Italian data, based on 
Thornton (1990, 2004), Stichauer (2006), and Montermini & Thornton (2014) which are, 
in some fundamental aspects, different with respect to French. In Section 5, I conclude 
by putting forward a (falsifiable) hypothesis according to which such deverbal nouns 
are liable to undergo clipping only when special semantic and pragmatic conditions are 
met. I point out that, contrary to what is usually assumed (especially for Italian), the 
shortened forms may not always be completely synonymous with their “full” parental 
nominals. 


2 Event/Referential nouns and clipping 


Since Grimshaw (1990), the distinction between complex event nouns, simple event nouns 
and result nouns has become widely accepted, though there has been much critical dis- 
cussion about the various criteria that Grimshaw herself proposed to individuate the 
three groups (see Melloni 2011: 21-34). 

It has also been thought that only complex event nouns can give rise to various result
interpretations where the result reading is normally associated with the outcome of the
corresponding complex event noun. Traditional examples of such event/result (E/R)
ambiguity are given in (1), where the English examples (1a, 1b) are given an equivalent
version in Italian (1c, 1d) and French (1e, 1f).


¹ “Inflectional morphology provides the word forms inhabiting the cells in the lexeme's
paradigm. [...], a derivational process defines a new lexeme, which may well have a
completely new set of inflectional properties. Therefore, derivational morphology cannot
be defined using the same machinery as inflectional morphology, because a derived lexeme
is not paradigmatically related to its base and cannot be considered a word form of
anything. Rather, it defines an entirely new set of (possibly inflected) word forms.”
(Spencer 2013: 2).


(1) a. The construction of that house (by the company) took place forty years ago → E
    b. The construction is breathtaking → R
    c. La costruzione di quella casa (da parte dell'impresa) ebbe luogo quarant'anni fa → E
    d. La costruzione è molto bella → R
    e. La construction de la maison (de la part de la compagnie) a eu lieu il y a
       quarante ans → E
    f. La construction est très belle → R


Simple event nouns (e.g. party), instead, do not have an associated event structure, so 
that the event/result polysemy is not available. Moreover, simple event nouns are said 
to pattern with result nouns in that they share the same set of properties (see Melloni 
2011: 24-25). In what follows, I will assume the general divide between an event-based 
reading and result-based reading of the derived nominals, discussing various problems 
in due course. 

When it comes to clipping, the general divide between E/R nominals turns out to be 
relevant as there are specific constraints on the semantic status of the deverbal noun. 
In fact, as Kerleroux claims (1997: 155), “nouns denoting complex events may not be 
apocopated”. 

However, as we shall see, the situation is more complicated, since there are more subtle
semantic conditions that allow for clipping. More precisely, the clipping rule seems to
eliminate the possibility of an event noun interpretation (E) regardless of whether the
affected noun is a complex event or a simple event nominal. Rather, what is required is
a specific result/object, or referential (R), denotation of the corresponding deverbal
noun, as illustrated in (2).


(2) French
    a. La récupération des naufragés fut longue → E
    b. * La récupe des naufragés fut longue → *E²
       ‘The rescue operation of the shipwrecked was long’
    c. J'ai des récupérations à prendre avant Noël → R
    d. J'ai des récupes à prendre avant Noël → R
       ‘I have some extra days of holiday to take before Christmas’
    e. Il s'oppose à l'introduction du loup à Paris → E
    f. * Il s'oppose à l'intro du loup à Paris → *E
       ‘He is against the introduction of the wolf into Paris’
    g. Il a apprécié l'introduction de ton livre → R
    h. Il a apprécié l'intro de ton livre → R
       ‘He enjoyed the introduction of your book’


² Georgette Dal (p.c.) observes that, on the Internet, we can easily find some examples
of the eventive reading as well, such as “La recup(e) a été longue car j'avais une
centaine de courriers à récupérer”.


As far as Italian is concerned, the situation is more intricate. Following Thornton 
(2004) and Montermini & Thornton (2014), a distinction must be made between those 
deverbal nouns in -a which are the result of the unproductive process of conversion 
(e.g., la conquista, la sosta, la firma etc.), and the apparently identical deverbal nouns in 
-a such as bonifica, condanna, conferma whose (diachronic) origin is to be sought in the 
truncation of the actional suffix -zione (see Montermini & Thornton 2014: 187 ff.). 

Although the diachronic account is surely on the right track, synchronically the be- 
haviour of pairs of full vs. clipped formations is far from being identical. As I will argue 
below, it is worth drawing a distinction between three groups. 

The first group comprises the pairs of formations which seem to be totally interchange- 
able displaying (purportedly) absolute synonymy, such as modificazione / modifica (3), 
where both forms display regular E/R ambiguity: 


(3) Italian
    a. La modificazione del testo (da parte dell'autore) è stata molto lunga → E
    b. La modifica del testo (da parte dell'autore) è stata molto lunga → E³
       ‘The modification of the text (by the author) took a long time’
    c. La modificazione del testo sarebbe subito saltata fuori → R
    d. La modifica del testo sarebbe subito saltata fuori → R
       ‘The modification of the text would have surfaced immediately’
    e. La modificazione (del testo) è sul tavolo → R
    f. La modifica (del testo) è sul tavolo → R
       ‘The modification (of the text) is on the table’


The second group involves partly synonymous formations in which the difference is 
claimed to lie exclusively at the stylistic level, such as giustificazione / giustifica (4), but 
which may display deeper semantic differences, as I will show, especially when it comes 
to the difference between an event vs. referential reading. In fact, as the examples in (4) 
show, the event reading of the clipped form tends to be rather unacceptable. 


(4) Italian
    a. Le ripetute giustificazioni dell'assenza (da parte degli studenti) sono
       intollerabili → E
    b. * Le ripetute giustifiche dell'assenza (da parte degli studenti) sono
       intollerabili → *E
       ‘The frequent justifications for absence (on the part of the students) are
       intolerable’
    c. La giustificazione dell'assenza è falsa → R
    d. La giustifica dell'assenza è falsa → R
       ‘The justification for absence is false’
    e. La giustificazione è sul tavolo → R
    f. La giustifica è sul tavolo → R
       ‘The justification is on the table’


³ In French, the clipped form la modif would also seem to be possible, as some examples
from the Internet show, such as “ceux qui sont grisés apparaissent comme dégrisés après
la modif du texte” (Georgette Dal, p.c.).


Finally, a third group, not explicitly addressed in the literature, involves impossible,
unacceptable formations where the clipping of the suffix is disallowed even when the
full noun in -zione displays some referential reading, as the examples in (5) illustrate.


(5) a. La riunificazione delle due Germanie è stata un processo complesso → E

b. *La riunifica delle due Germanie è stata un processo complesso → *E
‘The reunification of the two Germanies was a complex process’

c. Questo sedimento è la stratificazione di rocce diverse → R

d. *Questo sedimento è la stratifica di rocce diverse → *R⁴
‘This sediment is a (result of the) stratification of various rocks’


In what follows, I shall concentrate precisely on these two groups where we find, on 
the one hand, some attested pairs of full vs. clipped formations with presumably slightly 
different semantics, and, on the other hand, unattested, yet possible or impossible clipped 
forms. To begin with, I posit that what the two clipping rules, in French and in Italian, 
respectively, seem to have in common is a sort of (partial) elimination of event reading 
of the deverbal noun in favour of a salient referential interpretation. At the same time, a 
specific semantic condition on the kind of object (i.e. the type of referential reading) is 
required for the rule in question. In the next sections, after first considering some French 
and - in more detail - Italian examples, I will argue that a special typology of result 
nominals (elaborated by Melloni 2011) is needed in order to account for the phenomena 
in question. I intend to show that a lexical semantic typology of the base verbs will be 
able to predict, to a large extent, the possibility of clipping. 


3 Clipped deverbal nominals in French 


In this section, I briefly review the French data, taken from the literature, focusing on the 
general condition for the clipping rule, which will turn out to be useful in the discussion 
of the Italian examples as well. 


⁴I owe this example to Fabio Montermini.
In French, the clipping rule, as far as deverbal nouns with the suffix -tion are concerned,
may apply to a number of formations.⁵ When clipped, the noun receives a special
result/object reading although some aspects of event interpretation are maintained.
The clipped nouns thus become similar to simple event nouns. The internal arguments of
the base verb are, in such a formation, excluded (see Kerleroux 1997: 171):


(6) a. La manifestation de la vérité aura pris cinquante ans → E

b. *La manif de la vérité aura pris cinquante ans → *E
‘The demonstration of the truth will have taken fifty years’

c. La manifestation (des étudiants) a duré cinq heures → E

d. La manif (des étudiants) a duré cinq heures → E
‘The demonstration (of the students) took five hours’


According to Kerleroux (1997: 155), already cited above, the difference lies precisely in 
the complex / simple event dichotomy. Complex event nominals, which maintain their 
internal argument structure, cannot undergo clipping, whilst in the case of simple event 
nouns, such as manifestation in the sense of ‘demonstration’, clipping is allowed. 

In the following example, the possibility of clipping is limited to a more concrete (and
not eventive) interpretation of ‘introduction’, that of information-object.⁶ This notion will
be of great importance in the discussion of the Italian data.


(7) a. L'introduction du lynx dans le massif du Vercors par les responsables de l'ONF → E

b. *L'intro du lynx dans le massif du Vercors par les responsables de l'ONF → *E
‘The introduction of the lynx into the Vercors Massif by the authorities of the
ONF (National Forest Office)’

c. L'introduction (de ton livre) compte quatre pages → R

d. L'intro (de ton livre) compte quatre pages → R
‘The introduction (of your book) has four pages’


The important point is that clipping in French does not seem to eliminate eventive 
readings altogether. In the case of event nouns, the difference between pure transposi- 
tions (complex event nominals) and what we might call “names of specific events" is 
relevant. Indeed, as Fradin states, the condition on clipping seems to be that 


⁵I deliberately leave aside the general context for truncation which, in French, is not limited to complex
words (having as its target only the suffix) but may be applied to a wide range of bases, such as
documentation → doc, information → info, actualité → actu, etc. As Montermini & Thornton (2014: 183) point out, in
cases where the truncated material coincides with the suffix (e.g., invitation → invite), the coincidence is to
be taken as purely fortuitous.

⁶As Fabio Montermini notes (p.c.), such an information-object feature does not prevent, in principle, an
event-based reading, as witnessed by the acceptability of l'intro de son discours a duré une heure, where
discours ‘speech’, being a simple event noun, enables clipping.
(...) d'une manière générale, ne peuvent être accourcies que des expressions nominales
fonctionnant comme des dénominations (names) d'entités diverses (individu,
objet, comportement...). [Generally, what can be shortened are the expressions
functioning as denominations, names of various entities such as individuals,
objects, behaviour]. (Fradin 2003: 250)


I now turn to the Italian data in order to identify further semantic constraints on the kind
of entities these generally need to be for clipping to take place.


4 Clipped deverbal nominals in Italian 


According to Thornton (1990, 2004: 519), the Italian shortened forms are to be taken 
simply as stylistic variants of their corresponding full nominals. Furthermore, as Mon- 
termini & Thornton (2014: 193-194) show on the basis of corpus frequency, many short- 
ened forms (especially those in -ifica) have by now become far more frequent than their 
full counterparts. 

Stichauer (2006) proposes, as already mentioned above, to distinguish three groups 
of such clipped nominals that behave differently with respect to the original deverbal 
nouns with the suffix -zione. 

The first group comprises the pairs such as modificazione-modifica (3) or verificazione- 
verifica (8) in which the clipped form has already assumed the same syntactic distribu- 
tion; moreover, in this case of verificazione/verifica, the clipped form is far more accept- 
able because of its increasing frequency of use. 


(8) a. La verificazione della teoria (da parte degli scienziati) è stata affrettata → E

b. La verifica della teoria (da parte degli scienziati) è stata affrettata → E
‘The verification of the theory (by the scholars) was hasty’

c. La verificazione (della teoria) va pubblicata su una rivista importante → R

d. La verifica (della teoria) va pubblicata su una rivista importante → R
‘The verification (of the theory) is to be published in an important journal’

e. La verificazione (della teoria) è sul tavolo → R

f. La verifica (della teoria) è sul tavolo → R
‘The verification is on the table’


In the second group of formations we should take into consideration cases in which,
on the contrary, we find a shortened form that has a specialized meaning with respect
to the noun in -zione, e.g. permutazione - permuta. While the former noun is a normal
event nominal, the latter refers to a specialized type of property exchange⁷ (9):


⁷Montermini & Thornton (2014: 196-198) rectify Stichauer's (2006: 33) incorrect claim about the loss of a
transpositional relation between the verb permutare and permuta. In fact, permuta clearly functions as an
event noun, being thus similar to the relation between, say, the French verb manifester with respect to
manifestation and manif. Moreover, Montermini & Thornton (2014: 198) suggest that permuta is to be taken
as a converted form rather than a clipped formation.

⁸The examples are taken from the corpus La Repubblica and slightly modified.
(9) a. Questo poemetto (...) si fonda sulla permutazione dei ruoli tra l'uomo e l'animale

b. Questo poemetto (...) si fonda sulla *permuta⁹ dei ruoli tra l'uomo e l'animale
‘This short poem is based on the permutation of roles between man and animal’

c. Che dire poi di coloro che cedono la propria auto in permuta?

d. Che dire poi di coloro che cedono la propria auto in *permutazione?
‘What can we say then about those who trade in their cars?’


Finally, the third group of nouns would be the one in which clipping is impossible. Al- 
though this question is not directly addressed in the literature, I maintain that it is inter- 
esting to uncover the constraints that seem to regulate the possibility or impossibility 
of a hypothetical nonce-formation. In fact, if only stylistic constraints were at work,
we should find many more examples in various administrative texts than we actually
encounter. Moreover, if only such diaphasic differences were responsible for the clipping
rule, many a nonce-formation, e.g. la continua desertificazione del pianeta → la continua
*desertifica del pianeta (‘the continuous desertification of the planet’), might become
acceptable under specific stylistic circumstances. However, this does not seem to be the
case.

I will limit my analysis to a narrow sample of nouns in -ificazione that seem to be the 
most frequent deverbal nominals that might, under specific conditions to be stated below, 
undergo clipping of the suffix -zione. For the present, I will assume that where clipping 
is allowed, a special result/object denotation is required or imposed by the mechanism 
in question; at the same time, the complex or simple event reading is, in some cases, 
partially eliminated. 

I shall consider the following six examples: riunificazione, mercificazione, reificazione,
quantificazione, giustificazione and falsificazione. I will employ roughly the same
“diagnostic” contexts also used by Melloni (2011). This step is obviously problematic for the
simple reason that the diagnostic contexts do not always yield an entirely natural
example, attested or “attestable” in the corpora. I attempt to remedy this shortcoming
by modifying or integrating the examples according to real data present in the corpora
CORIS/CODIS¹¹ and La Repubblica,¹² or on the Internet (by a general search on google.it).
When necessary, I also add a clarifying footnote (especially when native speakers'
judgements tend to give variable results).
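The relation between a noun in -ificazione and its hypothetical clipped counterpart is purely formal and can be sketched in a few lines. The following is only an illustrative sketch, not the procedure actually used for the corpus searches: the function names and the toy set of types (TOY_TYPES) are assumptions introduced here, and the toy set is invented for the example rather than drawn from CORIS/CODIS or La Repubblica.

```python
from typing import Dict, Iterable, Optional

SUFFIX = "zione"  # the material removed by clipping

def clipped_candidate(noun: str) -> Optional[str]:
    """Return the hypothetical clipped form of a noun in -ificazione.

    Stripping -zione from X-ificazione leaves X-ifica
    (e.g. modificazione -> modifica). Nouns outside the
    -ificazione sample yield None.
    """
    if noun.endswith("ificazione"):
        return noun[: -len(SUFFIX)]
    return None

def attestation_table(nouns: Iterable[str],
                      corpus_types: Iterable[str]) -> Dict[str, bool]:
    """Map each candidate clipped form to True if it occurs among the given types."""
    types = set(corpus_types)
    table = {}
    for noun in nouns:
        candidate = clipped_candidate(noun)
        if candidate is not None:
            table[candidate] = candidate in types
    return table

# The six nouns examined in this section.
SAMPLE = ["riunificazione", "mercificazione", "reificazione",
          "quantificazione", "giustificazione", "falsificazione"]

# Hypothetical stand-in for a list of corpus types (NOT real corpus data).
TOY_TYPES = {"giustifica", "verifica", "modifica"}

if __name__ == "__main__":
    for form, attested in attestation_table(SAMPLE, TOY_TYPES).items():
        print(form, "attested" if attested else "unattested")
```

Such a table only records whether a form occurs; it says nothing about which reading (eventive or referential) the occurrence has, which is what the diagnostic contexts below are meant to probe.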


⁹In fact, a web search on google.it (http://www.ilcovile.it/news/archivio/00000420.html) provides one
example of the shortened form permuta in precisely this context. The sequence permuta dei ruoli can be found
in the Italian translation of Jankélévitch's book Le Paradoxe de la morale.

¹⁰For instance, in the corpus of La Repubblica (330 million tokens), we find about 150 different types in
-ificazione, and about 90 forms ending in -ifica, where after careful post-processing, about a dozen
formations remain and virtually no hapax qualifying as a real neologism can be found (la chiarifica being
probably the only exception).

¹¹Accessible at: http://corpora.dslo.unibo.it/TCORIS/. Accessed September-October, 2016.

¹²Accessible at: http://dev.sslmit.unibo.it/corpora/corpus.php?path=&name=Repubblica. Accessed
September-October, 2016.
I begin with riunificazione. In (10), we see that the only available reading is that of
an event; all possible result/referential readings are excluded simply because riunificare
does not belong to the class of product-oriented verbs (in the sense of Melloni 2011: 184 ff.):


(10) a. La riunificazione delle due Germanie ha richiesto molto tempo → E

b. *La riunifica delle due Germanie ha richiesto molto tempo → *E
‘The reunification of the two Germanies took a long time’

c. *La riunificazione è falsa → *R

d. *La riunifica è falsa → *R
(intended) ‘The reunification is false’

e. *La riunificazione è sul tavolo → *R

f. *La riunifica è sul tavolo → *R
(intended) ‘The reunification is on the table’


In the case of mercificazione (‘commodification’) we find essentially the same situation. 


(11) a. Questo processo di (continua) mercificazione del corpo femminile → E

b. *Questo processo di (continua) mercifica del corpo femminile → *E
‘This process of (continuous) commodification of the female body’

c. *Le presenti mercificazioni del corpo femminile non sono affatto belle → *R

d. *Le presenti mercifiche del corpo femminile non sono affatto belle → *R
(intended) ‘The present commodifications of the female body are not nice at all’

e. *La mercificazione è sul tavolo → *R

f. *La mercifica è sul tavolo → *R
‘The commodification is on the table’


It could be argued, however, that the verb mercificare is semantically close to verbs 
of creation (by modification). The impossibility of having an R-reading might be due to 
the same reasons for which edificazione from edificare, as a typical creation verb, does 
not display any result/object interpretation. Melloni (2011: 189) suggests that a possible 
R-interpretation is blocked by the existing lexeme edificio. 

Analogous behaviour is also exhibited by reificazione (12) (‘reification’), which is ac- 
ceptable only in the eventive reading. 


(12) a. Le osservazioni di L. C. sulla (costante) reificazione dei bambini meritano... → E

b. Le osservazioni di L. C. sulla (costante) *reifica dei bambini meritano... → *E
‘L. C.’s remarks on the (constant) reification of children deserve...’

c. *La reificazione è interessante → *R

d. *La reifica è interessante → *R
(intended) ‘The reification is interesting’

e. *La reificazione è sul tavolo → *R

f. *La reifica è sul tavolo → *R
(intended) ‘The reification is on the table’


In the nouns in (10-12) we thus find that the only possible interpretation is the one 
associated with event nominals, the result reading of the construction-type nouns being 
ruled out. Arguably, the absence of such a result/object aspect is the factor that does not 
allow for further clipping of the formation. Indeed, the result/object reading seems to be 
a necessary, albeit not sufficient, condition. As we will see in the examples below (13-17), 
clipping seems to be sensitive to a special result/object meaning which is very close to 
what Pustejovsky (1991: 164; see Melloni 2011: 109, 111, 142) calls information object. It thus 
appears that where there is such an information object reading available to the relevant 
nominal, the clipping rule may apply. 

I now pass to the discussion of such nouns. I start with quantificazione. In example
(13b), we can see that the shortened form is less acceptable in the eventive reading.¹³
The referential reading, which conveys an information-object interpretation, allows for
clipping, giving rise to a possible nonce-formation °la quantifica.¹⁴


(13) a. La quantificazione dei costi deve essere effettuata al più presto → E

b. ?*La quantifica dei costi deve essere effettuata al più presto → ?*E
‘The quantification of the costs must be carried out immediately’

c. La quantificazione (dei costi) contiene un errore → R

d. °La quantifica (dei costi) contiene un errore → R
‘The quantification (of the costs) contains an error’

e. La quantificazione è sul tavolo → R

f. °La quantifica è sul tavolo → R
‘The quantification is on the table’


I argue that the pair giustificazione / giustifica, seen above in example (4), repeated here 
as (14), shows essentially the same behaviour despite Montermini & Thornton's (2014: 
192) claim about its total synonymy. 


(14) a. Le frequenti giustificazioni dell'assenza (da parte degli studenti) sono intollerabili → E

b. *Le frequenti giustifiche dell'assenza (da parte degli studenti) sono intollerabili → *E
‘The frequent justifications for absence (on the part of the students) are intolerable’


¹³Some speakers tend to accept the shortened form even in this eventive context (Fabio Montermini finds
it totally acceptable without perceiving any difference whatsoever). Thus, it would be necessary to see
whether all possible eventive contexts, offered below for giustifica, would equally yield a more or less
acceptable formation. The corpora offer no example. However, an internet search conducted in July 2017
found 7 hits, including an example where the author puts the formation within quotation marks in order to
signal its peculiar (neological?) status: Secondo me è una discreta opportunità di lavoro con contratto biennale,
ma ho bisogno di una “quantifica” dei costi che io non so proprio fare.

¹⁴I follow here Corbin’s (1987) use of the ° sign to mark possible, yet unattested formations. However, as we
have seen, the formation quantifica is modestly attested (albeit to a very limited extent).
c. La giustificazione dell'assenza è falsa → R

d. La giustifica dell'assenza è falsa → R
‘The justification for absence is false’

e. La giustificazione è sul tavolo → R

f. La giustifica è sul tavolo → R
‘The justification is on the table’


The example thus deserves more discussion. Montermini & Thornton (2014: 192) claim 
that giustificazione and giustifica are absolutely synonymous (differing only in the reg- 
ister, the latter being typical of a school jargon). To support this apparently indubitable 
fact, they adduce not only their native speaker judgements but also some corpus evi- 
dence, such as the (fixed) sequence libretto delle giustificazioni / libretto delle giustifiche 
which appears in a large number of official school rules and regulations. However, I 
argue that the synonymy of this pair is limited to just the referential reading where, in- 
deed, the two formations are wholly interchangeable. Yet, in the eventive readings, the 
synonymy is far less obvious. 

First, as shown above in examples (14a, 14b), if subjected to different tests of actionality, 
the form giustifica turns out to be ruled out. Drawing (loosely) on Anscombre's (1986) 
tests of actionality, I point out that the following constructions highlight the problems 
at hand. 


(15) a. Gli studenti hanno sempre trovato un metodo di giustificazione/*giustifica
delle loro assenze
‘The students have always found a method of justification of their absences’

b. In caso di mancata giustificazione/??giustifica dell'assenza da parte degli
alunni, verrà attivata un'azione disciplinare¹⁵
‘Failure on the part of the student to provide justification of the absence may
result in disciplinary action’

c. Ora procediamo alla giustificazione/*giustifica delle assenze
‘Now let’s move on to justifying the absences’

d. Non si può accettare una giustificazione/*giustifica così frettolosa
‘It’s not possible to accept such a hasty justification’


What I stress is that the clipped form, displaying a clear information-object meaning 
(la giustifica is primarily a written document), is far less acceptable in all eventive read- 
ings enhanced by the constructions of the type seen in (15). I argue that such a semantic 
condition, though being probably just a slight tendency, can be best seen in the example 
of falsificazione. The underlying verb, falsificare, can have two meanings, a material one 
of falsificare la moneta, la carta di credito etc. (to falsify the money, the credit card) and 
a Popperian sense of falsificare un’ipotesi (to falsify a hypothesis). When falsification is 


¹⁵For some speakers, in fact, giustifica is acceptable even in this dynamic reading, while for others it tends
to be ruled out.
understood in the “material” sense, clipping seems to be ruled out (16), but when it comes
to the other meaning, an information-object reading appears to be more acceptable (17)
given that the falsification of a hypothesis may in fact be a written document.


(16) a. La falsificazione delle carte di credito (da parte di alcune persone) è sempre
stata facile → E

b. *La falsifica delle carte di credito (da parte di alcune persone) è sempre stata
facile → *E
‘The falsification of the credit cards (by some people) was always easy’

c. Questa carta di credito è una falsificazione → *R

d. *Questa carta di credito è una falsifica → *R
‘This credit card is a falsification’

e. La falsificazione (della carta di credito) è sul tavolo → *R

f. *La falsifica (della carta di credito) è sul tavolo → *R
‘The falsification (of the credit card) is on the table’




(17) a. La falsificazione di quella ipotesi (da parte dello studioso) non ha richiesto
molto tempo → E

b. *La falsifica di quella ipotesi (da parte dello studioso) non ha richiesto molto
tempo → *E
‘The falsification of that hypothesis (by the scholar) didn't take much time’

c. La falsificazione (di quella ipotesi) è geniale → R

d. °La falsifica (di quella ipotesi) è geniale → R
‘The falsification (of that hypothesis) is brilliant’

e. La falsificazione è sul tavolo → R

f. °La falsifica è sul tavolo → R
‘The falsification is on the table’


What the two contexts have in common is the possibility of a result-object interpretation.
But while in (16c-16f) the referential reading is more “material”, in (17c-17f)
the information-object reading of falsificazione strongly favours the acceptability of the
clipped variant falsifica (see also Montermini & Thornton 2014: 196, note 16 on
falsifica). I take this case, along with the others discussed above, as an example of Fradin's
hypothesis according to which


[...] un procédé dérivationnel donné opère de manière discriminante sur l'une ou
l'autre de ces significations. [a given derivational process operates differentially on
one or the other of these meanings.] (Fradin & Kerleroux 2009: 86)


¹⁶The form falsifica is in fact attested on the Internet only a couple of times, in a context that seems to be due
to strong analogy with verifica: “...sostituire alle procedure rigorose di verifica e falsifica di proposizioni
scientifico-sperimentali un metodo simile a quello storico-comprensivo...”; “...i dati sperimentali sono il
fondamento della verifica/falsifica di ogni ipotesi scientifica...”; “...isolare e selezionare quei fatti, e quei
modi di viverli, che consentono la verifica (o falsifica) di date ipotesi...”; “...Gli epistemologi hanno così
iniziato a riflettere e a cercare situazioni di verifica o di falsifica di queste ipotesi...”
5 Concluding remarks


On the basis of the data analyzed so far, which represent only a very limited sample,
I now conclude by summarizing my main proposal.

I maintain that the clipping rule is sensitive to the information-object meaning of the 
construction in -zione. Such an information-object meaning can be predicted from the 
general semantics of the base verb. 
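The proposal can be restated as a small decision sketch over the six nouns discussed in Section 4. This is merely one way of making the generalisation explicit, under the assumption that each noun (or each sense of a noun) can be coded for the availability of an information-object reading; the class and function names are invented for the illustration, and the boolean values simply transcribe the judgements reported in examples (10) to (17), which, as noted, vary across speakers.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Nominal:
    form: str           # full noun in -zione
    sense: str          # English gloss of the relevant sense
    event: bool         # eventive (E) reading available
    info_object: bool   # referential (R) reading of the information-object kind

# Judgements transcribed from examples (10)-(17); a very limited sample.
SAMPLE: List[Nominal] = [
    Nominal("riunificazione", "reunification", True, False),
    Nominal("mercificazione", "commodification", True, False),
    Nominal("reificazione", "reification", True, False),
    Nominal("quantificazione", "quantification", True, True),
    Nominal("giustificazione", "justification", True, True),
    Nominal("falsificazione", "material forgery", True, False),
    Nominal("falsificazione", "falsification of a hypothesis", True, True),
]

def clipping_possible(n: Nominal) -> bool:
    """Clipping may apply only where an information-object reading is available."""
    return n.info_object

def clipped_form(n: Nominal) -> Optional[str]:
    """Return the clipped form in -ifica if clipping is possible, else None."""
    return n.form[: -len("zione")] if clipping_possible(n) else None

if __name__ == "__main__":
    for n in SAMPLE:
        print(f"{n.form} ({n.sense}): {clipped_form(n)}")
```

The predicate expresses a tendency rather than a categorical rule: it says where clipping may apply, not that the clipped form is actually attested.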

What Melloni (2011: 108) considers to be the core meaning of what she calls R
nominals may be captured in the following four or five classes, based on the semantics
of the underlying verb: product, means, path and measure, entity in state, and
sense extensions. She shows that within the product-oriented nominals a further
division is to be made between creation/result-object verbs (such as costruire),
creation-by-representation verbs (such as tradurre) and creation-by-modification verbs (such as
correggere). The representation (and also modification) class of creation verbs can, as Melloni
puts it,


[...] undergo a metonymic displacement and convey the concrete interpretation of 
its container object, (a piece of paper, for instance) [...]. (Melloni 2011: 201) 


Furthermore, still inside this class of creation verbs, there is another non-prototypical
group of speech act verbs (see Melloni 2011: 213-214) which convey a proposition that
can be, once again, understood as an information object à la Pustejovsky (1991), as, for
example, confessione, comunicazione etc. In such a perspective, we could also reconsider
the already lexicalized nouns such as condanna, confisca, deroga, proroga, ratifica,
nomina etc. (see Thornton 2004: 519). But this is, of course, a matter for future research.
For the present, I only wished to show that a general information-object meaning can 
indeed be a relevant factor in a (marginal) process of clipping of the Italian nouns in 
-ificazione. 
This paper deals with the role played by the notion of a lexeme in a constraint-based lexicalist
theory of grammar such as Head-driven Phrase Structure Grammar. Adopting a Word and 
Paradigm view of inflection, we show how the distinction between lexemes, individuated 
by their lexical semantics, and flexemes, individuated by their inflectional paradigm, can 
fruitfully be integrated in such a framework. This allows us to present an integrated analysis 
of stem spaces, inflection classes, heteroclisis and overabundance. 


It is often observed by morphologists that contemporary work in theoretical morphology
has little impact on formal theories of grammar, which on average are content
with a view of morphology quite close to that offered by the post-Bloomfieldian
morphemic toolkit. A notable exception to this situation is the pervasive use in Head-driven
Phrase Structure Grammar (henceforth HPSG) of the distinction between WORDS and
LEXEMES familiar from Word and Paradigm approaches to morphology (see among many
others Robins 1959, Hockett 1967, Matthews 1972, Zwicky 1985, Anderson 1992, Aronoff
1994, Stump 2001, Blevins 2016). In this paper we reevaluate the role of the lexeme in
HPSG in the light of 20 years of research, and in particular of recent attempts to inte- 
grate a truly realisational theory of inflection within the HPSG framework (Crysmann 
& Bonami 2016). We conclude that current theorizing conflates two distinct notions of 
an abstract lexical object: lexemes, which are characterised in terms of their syntax and 
semantics, and FLEXEMES (Fradin & Kerleroux 2003), which are characterised in terms 
of their inflectional paradigm. We propose distinct formal representations for lexemes 
and flexemes, and explore the benefits of the distinction for a formally explicit theory of 
morphology and the morphology-syntax interface. 

The structure of the paper is as follows. In Section 1, we present the standard view of
the lexeme in contemporary HPSG, and show that lexemes are given a dual representation,
as a distinct type of signs and as the value of the feature LID. In Section 2, we present
Information-based Morphology (IbM),¹ an HPSG-compatible realisational approach to
inflection, and show that lexemes-as-signs have no role to play in an HPSG using IbM
as its inflectional component. In Section 3 we discuss Fradin and Kerleroux’s distinction
between lexemes and flexemes, and argue that this should be encoded by distinguishing
a feature LID and the values it can take from PID objects: while the former reside
in syntactic/semantic representations, the latter are found in inflection proper. Finally,
in Section 4 we discuss the consequences of the distinction between LID and PID for the
modelling of heteroclisis and overabundance.


1 The lexeme in standard HPSG 


1.1 Lexemes as a distinct type of lexical signs 


Most current work in Head-driven Phrase Structure Grammar (henceforth HPSG; Pol- 
lard & Sag 1994, Ginzburg & Sag 2000, Sag et al. 2003) and its variant Sign-Based Con- 
struction Grammar (henceforth SBCG; Boas & Sag 2012) embraces the notion of a lex- 
eme, familiar from Word-and-Paradigm approaches to morphology. Under this view, a 
lexeme is an abstract lexical object encapsulating what is common to the collection of 
words belonging to the same inflectional paradigm. Although the details are complex 
and disputed, it is uncontroversial enough to assume that a lexeme may be comprised 
of some amount of phonological information (in the form of a stem, a collection of stem 
alternants, a consonantal pattern, etc.), morphological information (e.g. inflection class 
information), syntactic information (at the very least part of speech and valence
information), and semantic information corresponding to a notion of ‘lexical meaning’ (plus
linking of semantic roles to syntactic dependents). Inflection is then concerned with the
relation between (abstract) lexemes and (concrete) words,² while ‘word formation’, more
adequately called LEXEME FORMATION (Aronoff 1994), is concerned with morphological
relations between lexemes.

Since the late 1990s a growing consensus has emerged within HPSG that lexemes
should be treated as signs on a par with words.³ That is, the hierarchy of linguistic objects
includes the subhierarchy in Figure 1. Syntactic rules may form phrases by combining 
signs of type syn-sign, while rules of morphology manipulate only signs of type lex-sign. 
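The subhierarchy can be mimicked with ordinary class inheritance, which makes the dual status of words visible. The sketch below is only an analogy in a general-purpose language, not an HPSG implementation: the class names mirror the type names of Figure 1, while the two helper predicates are assumptions introduced for the illustration.

```python
class Sign:
    """Top of the subhierarchy: any linguistic sign."""

class SynSign(Sign):
    """Signs that syntactic rules may combine."""

class LexSign(Sign):
    """Signs that rules of morphology may manipulate."""

class Phrase(SynSign):
    pass

class Word(SynSign, LexSign):
    """Words inherit from both types: they are the morphology-syntax interface."""

class Lexeme(LexSign):
    pass

def visible_to_syntax(sign: Sign) -> bool:
    """Hypothetical helper: can a syntactic rule take this sign as input?"""
    return isinstance(sign, SynSign)

def visible_to_morphology(sign: Sign) -> bool:
    """Hypothetical helper: can a morphological rule take this sign as input?"""
    return isinstance(sign, LexSign)

if __name__ == "__main__":
    for sign in (Phrase(), Word(), Lexeme()):
        print(type(sign).__name__,
              visible_to_syntax(sign),
              visible_to_morphology(sign))
```

Only Word answers True to both predicates, which is the sense in which words belong to both components.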

This is intended to implement the notion of strong lexicalism. First, words constitute 
the interface of morphology and syntax, since they belong to both types. Second, mor- 
phology and syntax are discrete components of grammar inasmuch as some aspects of 


¹The framework is presented and elaborated in Bonami & Crysmann (2013, 2016), Crysmann (2017),
Crysmann & Bonami (2016). The name is intended as a reference to Pollard & Sag's (1987) Information-based
Syntax and Semantics. In IbM, the notion of information in the sense of feature logic plays a central role in
determining morphological wellformedness, defined in terms of exhaustive expression of morphosyntactic
properties. Furthermore, IbM implements Paninian competition on the basis of subsumption, a measure of
informativity in feature logic.

²Alternatively, within an abstractive conceptualisation of morphology (Blevins 2006), where words are seen
as primitives rather than derived objects, inflection is concerned with the relation between words in a
paradigm, and the abstract notion of a lexeme captures what is common between these words.

³See Bonami & Crysmann (2016) for a thorough overview of work on morphology in HPSG.
                sign
               /    \
        syn-sign    lex-sign
         /     \    /      \
    phrase      word      lexeme


Figure 1: A standard HPSG subhierarchy of signs


the feature geometry of signs will be specific to phrases or lexemes; likewise, this ar- 
chitecture allows for the possibility that the kind of combinatory rules relating phrases 
to their component parts be very different from the kind of combinatory rules relating 
words to their component parts. 

Although this is by no means an obligation, as we will see below, standard practice in 
HPSG and SBCG in the past two decades has been to assume an Item and Process view 
of morphology (Orgun 1996, Riehemann 1998, Koenig 1999, Müller 2002, Sag et al. 2003, 
Sag 2012), where the word-lexeme opposition captures the difference between inflection 
and lexeme formation. Rules of inflection map a lexeme to a word, rules of derivation 
map a lexeme to a lexeme, rules of composition map two lexemes to a lexeme. The three 
toy rules in Figures 2, 3 and 4 illustrate the basic architecture. 


[ word
  PHON    [1] + /z/
  SYNSEM  [2] [ HEAD [ noun, NUM pl ] ]
  M-DTRS  ⟨ [ lexeme
              PHON    [1]
              SYNSEM  [2] ] ⟩ ]


Figure 2: Simplified rule of regular English plural formation.


[ lexeme
  PHON    [1] + /ɚ/
  SYNSEM  [ HEAD noun ]
  M-DTRS  ⟨ [ lexeme
              PHON   [1]
              SS|HD  verb ] ⟩ ]


Figure 3: Simplified rule of English Agent noun formation.


Formally, morphological rules are modeled on a par with phrase-structure rules, ex- 
cept for the fact that, in inflectional and derivational rules, the relation between the 
phonology of the mother (the output lexical sign) and the phonology of the daughter 
(the input lexical sign) is specified syncategorematically: affixes are not signs, but bits 
[ lexeme
  PHON    [1] + [2]
  SYNSEM  [ HEAD noun ]
  M-DTRS  < [ lexeme, PHON [1], SS|HD noun ],
            [ lexeme, PHON [2], SS|HD noun ] > ]


Figure 4: Simplified rule of English noun-noun compound formation


of phonology added by rule.⁴ The main difference between inflection and lexeme formation
rules lies in the fact that inflection does not modify the SYNSEM value, but merely
expresses some of its aspects. The main specificity of composition is that the input (the
daughter signs) consists of two lexemes rather than one. Figures 5 and 6 illustrate typical
morphological analyses within such a framework.
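
The division of labour among the three rule types can be made concrete with a minimal Python sketch. It is not part of the HPSG formalism: the class `Sign`, the function names, and the ASCII stand-ins for phonology (`"luv"`, `"er"`, `"z"`) are all assumptions introduced for illustration, and number is simply stamped on the output word rather than shared through a SYNSEM value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sign:
    """Hypothetical stand-in for a lexical sign (not an HPSG feature structure)."""
    kind: str                      # "lexeme" or "word"
    phon: str                      # phonology, here a plain ASCII stand-in
    head: str                      # part of speech, e.g. "noun" or "verb"
    num: Optional[str] = None      # number, only set on inflected words
    dtrs: Tuple["Sign", ...] = ()  # morphological daughters (M-DTRS)


def agent_noun(base: Sign) -> Sign:
    """Derivation: lexeme -> lexeme; the head changes from verb to noun."""
    assert base.kind == "lexeme" and base.head == "verb"
    return Sign("lexeme", base.phon + "er", "noun", dtrs=(base,))


def compound(left: Sign, right: Sign) -> Sign:
    """Composition: two noun lexemes -> one noun lexeme."""
    assert left.kind == right.kind == "lexeme"
    assert left.head == right.head == "noun"
    return Sign("lexeme", left.phon + right.phon, "noun", dtrs=(left, right))


def plural(base: Sign) -> Sign:
    """Inflection: lexeme -> word; the head is kept, number is expressed."""
    assert base.kind == "lexeme" and base.head == "noun"
    return Sign("word", base.phon + "z", base.head, num="pl", dtrs=(base,))


love = Sign("lexeme", "luv", "verb")
lovers = plural(agent_noun(love))

bird = Sign("lexeme", "bird", "noun")
watch = Sign("lexeme", "watch", "verb")
birdwatchers = plural(compound(bird, agent_noun(watch)))
```

In this sketch the derivation history is recorded in `dtrs`, so `birdwatchers.dtrs[0]` is the compound lexeme and its second daughter is the agent noun built on `watch`.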


[ word
  PH      /lʌvɚz/
  SS|HD   [1] [ NUM pl ]
  M-DTRS  < [ lexeme
              PH      /lʌvɚ/
              SS|HD   [1] noun
              M-DTRS  < [ lexeme, PH /lʌv/, SS|HD verb ] > ] > ]


Figure 5: Analysis of the noun lovers under an Item-and-Process view of morphology


1.2 Lexeme identifiers 


It is sometimes necessary for a lexical entry or syntactic construction to be able to select a
particular lexical item in its environment. One clear case of this is that of flexible idioms. 
Consider the idiom pull strings ‘try something’. As the examples in (1) make clear, while 
the idiomatic meaning is present only when the object of pull is headed by the lexeme 
strings, the noun may occur in either singular or plural form, and combine with a variety 
of determiners and modifiers (Bargmann forthcoming). 


(1) a. There I learned whom [sic] my secret advocate was, the man who had pulled
       strings to get me the teaching job in the midst of a terrible economy, and who
       had pulled more strings to allow me to keep it, and who had then pulled even
       more strings to have my commission assigned to the Abwehr.⁵

    b. You'll never know the trouble I had, and the strings I had to pull to get you
       back from Berlin.⁶

    c. We have to remember that Jacob was at their wedding. Just how many strings
       did he pull?⁷

    d. So I didn't pull any string. Didn't need to.⁸

    e. When I got the job, I thought to myself, "Someone upstairs finally pulled a
       string for me"⁹

    f. No string was pulled, it was based on merit.¹⁰


⁴ For a dissenting view see Emerson & Copestake (2015).


[ word
  PH      /bɚdwɑtʃɚz/
  SS|HD   [1] [ NUM pl ]
  M-DTRS  < [ lexeme
              PH      /bɚdwɑtʃɚ/
              SS|HD   [1] noun
              M-DTRS  < [ lexeme, PH /bɚd/, SS|HD noun ],
                        [ lexeme
                          PH      /wɑtʃɚ/
                          SS|HD   noun
                          M-DTRS  < [ lexeme, PH /wɑtʃ/, SS|HD verb ] > ] > ] > ]


Figure 6: Analysis of the noun birdwatchers under an Item-and-Process view
of morphology


This type of situation motivated the introduction of the feature LID (or LEXEME IDENTIFIER)
as a head feature projecting to phrasal level information as to which lexeme heads
a phrase (Sag 2007, 2012).¹¹ Simplifying matters considerably, one can see the constructions
above as licensed by the two idiomatic lexical entries in Figure 8, which contrast
with the two ordinary entries in Figure 7: a special lexical entry of pull with idiomatic
meaning selects specifically for an object headed by a form of strings with idiomatic
meaning. The postulation of a specific LID value for idiomatic string allows idiomatic pull


⁵ K. Ryan, The Somnambulist, New York: iUniverse, 2006.
⁶ K. McDermott, The time of the corncrake, Victoria: Trafford, 2004.
⁷ http://www.losttv-forum.com/forum/showthread.php?t=65542. Accessed on November 26, 2016.
⁸ http://www.justusboys.com/forum/archive/index.php/t-437037.html. Accessed on November 26, 2016.
⁹ http://ultraphrenia.com/2016/10/02/a-cigarette-break-behind-heavens-gate. Accessed on November 13, 2016.
¹⁰ http://obafemayor02.blogspot.fr/2013_03_24_archive.html. Accessed on November 26, 2016.
¹¹ Note that a very similar role is played by the feature LISTEME in Soehn (2006) and Richter & Sailer (2010).
to select for a specific combination of an inflectional paradigm with an idiomatic meaning,
while abstracting away from inflectional and syntactic variability in the makeup of
the object of pull.


[ lexeme
  PHON  /pʊl/
  HEAD  [ LID pull-lid ]
  VAL   < NP_i, NP_j >
  CONT  [ pull-rel
          ACT i
          UND j ] ]

[ lexeme
  PHON  /stɹɪŋ/
  HEAD  [ LID string-lid ]
  CONT  string-rel ]


Figure 7: Ordinary lexical entries for pull and strings


[ lexeme
  PHON  /pʊl/
  HEAD  [ LID idiomatic-pull-lid ]
  VAL   < NP_i, NP_j[LID idiomatic-string-lid] >
  CONT  [ use-rel
          ACT i
          UND j ] ]

[ lexeme
  PHON  /stɹɪŋ/
  HEAD  [ LID idiomatic-string-lid ]
  CONT  influence-rel ]


Figure 8: Idiomatic lexical entries for pull and strings


The feature LID provides a useful mechanism for spreading lexical information in syntactic
structures that has been used since in the analysis of complex predicates (Miller
2010) and periphrastic inflection (Bonami & Webelhuth 2012, Bonami & Samvelian 2015,
Bonami 2015, Bonami et al. 2016). It also provides a direct encoding of lexemic identity.
Since LID is a head feature, and inflected words share the HEAD value of the lexeme they
are derived from, all inflected forms of a lexeme will have the same LID. Under the natural
assumption that all lexemes have a distinct LID value, whether two words instantiate the
same lexeme can thus be deduced by inspection of their LID values, without examining
their derivation history.
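
As a rough illustration of this point, the following Python sketch treats LID as a field of a shared head object. The classes `Head` and `Word`, the two helper functions, and the string values standing in for LID types are hypothetical names chosen for the example; only the LID labels themselves are taken from Figures 7 and 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    pos: str  # part of speech
    lid: str  # lexeme identifier, e.g. "idiomatic-string-lid"


@dataclass(frozen=True)
class Word:
    phon: str
    head: Head  # shared with the lexeme that the word realises


def same_lexeme(a: Word, b: Word) -> bool:
    """Lexemic identity is read off LID; no derivation history is inspected."""
    return a.head.lid == b.head.lid


def idiomatic_pull_accepts(object_head: Head) -> bool:
    """Idiomatic 'pull' selects an object whose head carries the idiomatic LID."""
    return object_head.lid == "idiomatic-string-lid"


IDIOM_HEAD = Head("noun", "idiomatic-string-lid")
PLAIN_HEAD = Head("noun", "string-lid")

string_sg = Word("string", IDIOM_HEAD)
strings_pl = Word("strings", IDIOM_HEAD)
plain_strings = Word("strings", PLAIN_HEAD)
```

Because the singular and plural forms share one head object, `same_lexeme(string_sg, strings_pl)` holds even though their phonology differs, whereas the homophonous ordinary plural is kept apart.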


2 The lexeme in a Word and Paradigm version of HPSG 


2.1 Going Word and Paradigm


While an Item and Process view of morphology has been dominant in the HPSG litera- 
ture, over the last 20 years a number of authors have become more vocal in advocating 
the incorporation into HPSG of a Word and Paradigm view of inflection (see among 
others Erjavec 1994, Miller & Sag 1997, Ackerman & Webelhuth 1998, Crysmann 2002, 
Bonami & Boyé 2006, Bonami & Webelhuth 2012, Bonami 2015, Bonami & Samvelian
2015, Crysmann & Bonami 2016). Under such a view, rules of inflection do not incre- 
mentally specify how a basic sign is augmented with morphosyntactic information and 
phonological exponents; rather, a full morphosyntactic specification of the word is given 
as input to a system of rules of exponence indicating how such a specification is partially 
realised by exponents in various positions with respect to the basic stem. The arguments 
in favour of such a move are the usual ones (Matthews 1974, Zwicky 1985, Anderson 
1992, Stump 2001, Brown & Evans 2012): systems of exponence depart too strongly from 
a one-to-one correspondence between form and content for the Item and Process view to 
make sense in the general case. We will not rehearse these arguments here, and simply 
make the sociological observation that Word and Paradigm approaches have over the 
last two decades become the de facto standard for theoretical and typological reasoning 
on inflection systems. 

Recent attempts at implementing Word and Paradigm inflection in HPSG come in 
two flavors. On the one hand, Bonami & Webelhuth (2012), Bonami (2015), Bonami
& Samvelian (2015) explicitly interface Paradigm Function Morphology (Stump 2001, 
2016) with HPSG through a set of relational constraints. On the other hand, Crysmann 
& Bonami (2016) design a realisational framework for inflection native to the HPSG ar- 
chitecture, Information-based Morphology (IbM), making heavy use of the underspecifi- 
cation techniques provided by a typed feature structure formalism. 

Figure 9 illustrates the main features of IbM by way of the analysis of a rather simple
inflected word, the French verb buvions ‘we drank’. IbM specifies the inflectional system
of a language as a set of constraints relating a word’s SYNSEM value to its PHONology.
In the present example, a word realising the past imperfective of the verb BOIRE in the
context of a 1PL subject is constrained to have the string /byvjɔ̃/ as its phonological
realisation. The specification of these constraints makes use of three intermediate, strictly
morphological, representations. The feature MS (standing for ‘morphosyntactic properties’)
encodes those syntactic and semantic properties of the word that are relevant to
inflection, in a format suitable for the expression of constraints on exponence. The feature
MPH (standing for ‘morphs’) indicates the set of morphs making up the word, indexed
for their position within the word (PC, standing for ‘position class’). Finally, the feature
RR (standing for ‘realisation rules’) indicates which generalisations on the relationship
between morphosyntactic properties and morphs license the particular association
between form and content instantiated in that word. Importantly, realisation rules relate
a set of morphosyntactic properties (listed under MUD, standing for ‘morphology under
discussion’) to a set of morphs (listed under MPH). Thus, while in this simple example
there is a one-to-one mapping between properties and morphs, IbM realisation rules can
just as easily accommodate cumulative exponence (m properties : 1 morph), extended
exponence (1 property : n morphs), overlapping exponence (m properties : n morphs),
and zero exponence (m properties : 0 morphs).

[ word
  PHON    /byvjɔ̃/
  MPH     { [PH /byv/, PC 0], [PH /j/, PC 2], [PH /ɔ̃/, PC 3] }
  RR      { [ MPH { [PH [1] /byv/, PC 0] }
              MUD { [LID [drink-lid, STEMS <[1], ...>]] } ],
            [ MPH { [PH /j/, PC 2] }
              MUD { [TNS pst, ASP ipfv] } ],
            [ MPH { [PH /ɔ̃/, PC 3] }
              MUD { [subj, PER 1, NUM pl] } ] }
  MS      { [LID [drink-lid, STEMS </byv/, /bwav/, /bwa/>]],
            [TNS pst, ASP ipfv],
            [subj, PER 1, NUM pl] }
  SYNSEM  [ HEAD    [ LID  [drink-lid, STEMS </byv/, /bwav/, /bwa/>]
                      TNS  pst
                      ASP  ipfv ]
            ARG-ST  < NP[1pl], NP > ] ]


Figure 9: A sample IbM analysis: the French IPFV.1PL word buvions ‘we drank’


The relationship between the various features is regulated by a set of general principles
that we will only state in prose here; we refer the reader to Bonami & Crysmann (2013) or
Crysmann & Bonami (2016) for a more explicit formulation. Let us start with the relationship
between the SYNSEM and MS values. This is regulated by a set of language-specific
constraints, since which aspects of syntax and semantics are realised by inflection is a
highly parochial matter. Two features of this interface are worth noting. First, lexeme-specific
information on inflection class and stem alternants is included in MS inside the
LID value. In particular, a list-valued feature STEMS provides an indexed set of stem
alternants, also known as a stem space (Bonami & Boyé 2006).¹² The choice of a particular
stem is then effected by a realisation rule of stem selection (Stump 2001), picking out the
appropriate value in this list, depending on the morphosyntactic context; in the present
instance, the default of picking the first stem applies. In other words, in IbM, even the
stem is taken to be the realisation of some word-level information, namely lexical identity.
Second, MS values are relatively flat in comparison to SYNSEM values, consisting of a
set of small feature structures, rather than one large, deeply recursive feature structure.
This is necessitated by the different demands of morphological and syntactic combination.¹³


¹² Bonami & Boyé (2006) argue that the French stem space has 12 coordinates. For simplicity we show only 3
in the example in Figure 9.

¹³ The distinction between SYNSEM and MORSYN may also be used to account for mismatches between content
and form at the morphology-syntax interface, as variously captured in the literature by distinguishing
syntactic and morphological features (Sadler & Spencer 2001, Corbett & Baerman 2006, Bonami 2015) or
content and form paradigms (Stump 2006, 2016).
We may now turn to the relationship between MS and RR. This is regulated by a principle
of morphological wellformedness: the MS value of a word must be identical to the
disjoint union of the MUDs of the realisation rules. In other words, each morphosyntactic
property must be realised by exactly one rule, although a single rule may realise multiple
properties at once.

Finally, the relation between RR, MPH and PHON is rather straightforward. First, the 
MPH value of a word is the union of the MPH values of its realisation rules: in other 
words, every morph must be licensed by at least one realisation rule, although a reali- 
sation rule may license more than one morph (extended or overlapping exponence), or 
even no morph at all (zero exponence). Second, a word’s phonology is determined by ap- 
pending the phonology of its morphs in accordance with the linear sequence of position 
class indices. Note that, although the system of position class indices encodes the notion 
of a morphotactic template, it does so with appropriate flexibility. There is no notion of 
an ‘empty position’ in the template: position class indices regulate the relative order of 
morphs, but morph ordering is not effected by putting bits of phonology in slots, just by 
appending bits of phonology in order. More importantly, realisation rules may partially 
underspecify the position they assign morphs to, allowing one to capture an unprece- 
dented set of situations of variable morphotactics. Note also that, although a realisation 
rule may encode zero exponence, it is not equivalent to a zero morpheme: having no 
morph as one’s exponent is not the same thing as having a morph with no phonological 
realisation. In particular, since no empty morphs are postulated, no sibylline decisions
need to be taken as to the positioning of inaudible elements. 
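
The three principles just stated in prose can be approximated by a short Python sketch. It is only an illustration under simplifying assumptions: morphosyntactic properties are flattened to strings, position classes are fully specified integers, and the names `Rule`, `realise`, `MS` and `RULES` are invented here rather than taken from the IbM literature.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, NamedTuple, Tuple

Morph = Tuple[str, int]  # (phonology, position class index)


class Rule(NamedTuple):
    mud: FrozenSet[str]    # properties the rule realises (its MUD)
    mph: FrozenSet[Morph]  # morphs the rule licenses (possibly none)


def realise(ms: Iterable[str], rules: List[Rule]) -> str:
    """Check the MS/MUD principle, then linearise the morphs by position class."""
    counts = Counter(prop for rule in rules for prop in rule.mud)
    # MS must equal the disjoint union of the MUD sets: every property is
    # realised, and by exactly one rule.
    if set(counts) != set(ms) or any(n != 1 for n in counts.values()):
        raise ValueError("MS is not the disjoint union of the rules' MUD sets")
    # The MPH value of the word is the union of the MPH values of its rules.
    morphs = set()
    for rule in rules:
        morphs |= rule.mph
    # PHON: append morph phonologies in order of position class index.
    return "".join(ph for ph, _pc in sorted(morphs, key=lambda m: m[1]))


STEM, IPFV, SUBJ_1PL = "byv", "j", "ɔ̃"
MS = {"lid=drink", "tns=pst,asp=ipfv", "subj=1pl"}
RULES = [
    Rule(frozenset({"lid=drink"}), frozenset({(STEM, 0)})),
    Rule(frozenset({"tns=pst,asp=ipfv"}), frozenset({(IPFV, 2)})),
    Rule(frozenset({"subj=1pl"}), frozenset({(SUBJ_1PL, 3)})),
]
BUVIONS = realise(MS, RULES)
```

Note that position class 1 is simply unused in this example: nothing is inserted for it, which mirrors the point that there are no empty slots. A rule with an empty `mph` set models zero exponence without positing a zero morph.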


2.2 The role of the lexeme in IbM 


Now that we have outlined the main features of IbM, let us consider the role of the lexeme 
in such a framework. Remember that in classical HPSG, inflection rules take the form of 
unary rules relating an abstract sign, the lexeme, to a surface sign, the inflected word. 
IbM has no use for such a notion of inflection rule, since inflection is stated directly as 
a relation between content and form at the word level. On the other hand, IbM makes 
crucial use of the notion of a lexeme identifier to state lexeme-specific phonological 
and morphological information; and the word/lexeme opposition is still a useful way 
of capturing the relationship between lexical entries and inflected words, and making a 
clear distinction between lexeme formation and inflection. 

We thus assume that, while there are no inflectional lexical rules, there is a general 
constraint on objects of type word to the effect that they are the realisation of a lexeme, 
as indicated in (2). This constraint enforces the monotonic character of inflection: unlike 
derivation, inflection does not modify syntax or semantics but merely realises whatever 
features are made available by paradigm structure and compatible with the syntactic 
context. This is enforced by the identity of SYNSEM values at the lexeme and word levels.


¹⁴ Implicit here are two assumptions familiar from Paradigm Function Morphology: (i) if two realisation rules
are appropriate in some context, only the rule realising more content may apply (Pāṇini's Principle); and
(ii) there exists a universal rule of default non-realisation, ensuring that a property set remains unrealised
if and only if the inflection system provides no other rule for its realisation.
(2) word → [ SYNSEM  [1]
             M-DTRS  < [ lexeme, SYNSEM [1] ] > ]


As a consequence of (2), an inflected word will inherit any constraint imposed by
the lexeme's lexical entry within SYNSEM, including, crucially, lexical identity and stem
alternants as specified through the LID feature. Note that we assume the PHON attribute
to be appropriate only for syn-sign objects (that is, words and phrases): lexemes constrain
the phonology of their inflected forms through the STEMS feature instead (Bonami & Boyé
2006). The inflection-specific features MPH, RR and MS are appropriate for words only. The
format of lexical entries and lexeme formation rules is essentially unchanged.


3 Lexemes and flexemes 


In this section we build on the general architecture just presented and argue that a dis- 
tinction between two notions of lexical identity needs to be made. 


3.1 Introducing the flexeme 


Up to now, we have assumed a simple relationship between lexemes and inflectional 
paradigms: the value of the same feature LID is used for purposes of lexeme selection and 
for purposes of individuating inflectional paradigms. In doing so we have been following 
standard practice in realisational morphology, where paradigm functions take ‘lexemes’ 
(Stump 2001, 2016) or equivalently a ‘lexemic index’ (Spencer 2013) as an argument. 

In an important but rarely cited paper, Fradin & Kerleroux (2003) note that matters
are not so simple, for reasons having to do with lexical ambiguity and the division of
labour between inflection and lexeme formation.¹⁵ Rules of inflection are not generally
concerned with matters of lexical ambiguity: from the point of view of inflection, the two
French verbs DEVOIR₁ ‘must’ and DEVOIR₂ ‘owe’ are indistinguishable, as they have the
same (highly irregular) inflectional paradigm. From the point of view of derivation, however,
things are different. Derived lexemes normally relate to one sense of their base: for
instance, while the French noun FILLE is ambiguous between two readings FILLE₁ ‘girl’
and FILLE₂ ‘daughter’, the diminutive FILLETTE ‘small girl’ only relates to the first.
Fradin & Kerleroux (2003) argue that this warrants a distinction between two kinds of
abstract lexical objects: lexemes and flexemes. Inflection is about flexemes, while derivation
is about lexemes. Because of the pervasive nature of lexical ambiguity, a single flexeme
often corresponds to multiple lexemes.


¹⁵ We purposefully use the general term ‘lexical ambiguity’ because whether the relevant examples are
instances of polysemy or homonymy does not affect the argument.

¹⁶ This very short summary does not do justice to Fradin and Kerleroux's insights, which build on an
examination of the compatibility of various lexeme formation rules in French (Fradin & Kerleroux 2003) with
various families of meanings. See also Fradin & Kerleroux (2009) for more discussion.
In the remainder we follow Walther (2013) in assuming that inflection is strictly concerned
with flexemes, and propose an implementation of the lexeme-flexeme distinction
in IbM.


3.2 LID and PID 


Within an HPSG view of the world, it is tempting to capture the relationship between lexemes
and flexemes in terms of underspecification in an inheritance hierarchy: flexemes
would then be abstract groupings of lexemes. Suppose for concreteness a hierarchical
organisation of LID values such as that indicated in Figure 10. Rules of inflection can
then be stated in terms of the supertype fille-lid, while lexemes are properly individuated in
terms of the subtypes; and hence FILLETTE can be uniquely related to the lexeme whose
LID value is fille1-lid.


              lid
               |
     fille-lid [STEMS </fij/>]
          /          \
  fille1-lid      fille2-lid


Figure 10: A first pass at flexemes in HPSG: flexemes as underspecified LID
values


While this is technically feasible, such an approach only obscures the orthogonal
roles played by the two notions. As illustrated above, IbM LID values are structured
objects, which encompass all lexically-specified information relevant to inflection,
including most notably stem alternants and inflection class. Such information is clearly
irrelevant to syntax, although it is an indispensable component of inflection. On the
other hand, studies that use LID for purposes of syntactic selection presuppose a tight
correspondence between LID values and lexical semantic identity, and have no use for
purely morphological information on stem alternants or inflection classes. In particular,
Sag (2012) argues that LID values are to be identified with the main semantic predicate
associated with a lexeme. One clear advantage of this convention is avoidance of redundancy
in lexical entries: it is not necessary to postulate a new symbol as the LID value of
each lexeme, since such a symbol is already present in the lexical entry as the constant
designating the lexeme's main semantic predicate.

We now propose to clarify the situation by adopting Sag's view of LID. This entails that,
for purposes of inflection, a separate index must be posited that individuates words
according to which flexeme they instantiate. We call this index PID, standing for ‘paradigm
identifier’. While LID resides in HEAD and is thus available for selection in idioms, complex
predicate constructions, or periphrastic constructions, PID is a top-level feature carried
by signs of type lexeme only. As such it can be specified by lexical entries or manipulated
by lexeme formation rules. In addition, it is universally constrained to be present
among the features realised by inflection through inclusion in MS, as indicated in (3). This
is crucial to ensuring that inflection is always concerned with the realisation of lexical
identity.

(3) word → [ MS      { [1], ... }
             M-DTRS  < [ lexeme, PID [1] ] > ]


In this architecture then, lexical entries need to specify both an LID and a PID value. To
elaborate on the same example, an appropriate analysis of FILLE would posit two lexical
entries sharing the same PID object while having different LID values, as indicated in
Figure 11.


[ lexeme
  SS   [ CAT|HD|LID  [1] girl-rel
         SEM|RESTR   { [1] } ]
  PID  fille-pid ]

[ lexeme
  SS   [ CAT|HD|LID  [1] daughter-rel
         SEM|RESTR   { [1] } ]
  PID  fille-pid ]


Figure 11: Proposed lexical entries for the two lexemes FILLE.


Under this analysis, the two lexemes FILLE are related by virtue of having indistinguishable
PIDs, but they are still distinguishable in terms of LID. Hence, as indicated in
the lexical entry in Figure 12, the derived noun FILLETTE adds diminutive semantics (dim-rel)
to the semantics of its base, which is constrained to be that lexeme with LID girl-rel,
i.e., the left-hand lexeme in Figure 11. This captures the notion of formal lexical identity
at the level of PID while implementing Fradin and Kerleroux's insight that derivational
morphology operates on fully specific rather than underspecified lexemes.
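
A minimal Python sketch of this two-index design is given below. The class `Lexeme`, the helper functions, and the choice to label the derived lexeme's LID as `"dim-rel"` are assumptions made for illustration; the LID and PID labels for FILLE come from Figure 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    lid: str  # semantic identity, visible to syntax and to derivation
    pid: str  # paradigm identity (flexeme), visible to inflection


FILLE_GIRL = Lexeme(lid="girl-rel", pid="fille-pid")
FILLE_DAUGHTER = Lexeme(lid="daughter-rel", pid="fille-pid")


def same_flexeme(a: Lexeme, b: Lexeme) -> bool:
    """Inflection only looks at PID."""
    return a.pid == b.pid


def same_lexeme(a: Lexeme, b: Lexeme) -> bool:
    """Derivation and syntactic selection look at LID."""
    return a.lid == b.lid


def diminutive_ette(base: Lexeme) -> Lexeme:
    """Toy lexeme formation rule: only the 'girl' reading is a licit base."""
    if base.lid != "girl-rel":
        raise ValueError("FILLETTE is built on the lexeme with LID girl-rel")
    return Lexeme(lid="dim-rel", pid="fillette-pid")


FILLETTE = diminutive_ette(FILLE_GIRL)
```

Here `same_flexeme(FILLE_GIRL, FILLE_DAUGHTER)` holds while `same_lexeme` does not, and applying `diminutive_ette` to `FILLE_DAUGHTER` is rejected.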


[Figure 12: AVM of type lexeme for FILLETTE, with dim-rel in its semantics and PID fillette-pid; its M-DTRS list contains a single lexeme with CAT|HD|LID girl-rel and PID fille-pid]


Figure 12: Proposed lexical entry for the lexeme FILLETTE ‘small girl’.




3.3 Individuating flexemes: stem spaces 


We now turn to the nature of pid objects. Evidently, there should be enough distinct 
PID values to be able to distinguish each flexeme from one another; that is necessary 
and sufficient to capture the notion of a flexeme. In the context of a typed-feature struc- 
ture ontology, however, it is very natural to use PID to capture all aspects of inflectional 
identity. We thus take PIDs to be structured objects providing enough phonological and 
inflectional information to deduce a whole paradigm with minimal redundancy: hence,
at the very least, for the simplest inflectional systems, a basic stem. For systems of any 
complexity, this basic information needs to be supplemented with inflection class infor- 
mation (if there is more than one inflectional strategy) and information on stem alter- 
nants (if there are unpredictable stem alternations). 

We illustrate a simple approach to the encoding of stem alternations by adapting the
HPSG analysis of French conjugation presented in Bonami & Boyé (2006). French verbs
exhibit pervasive stem alternations, illustrated in Table 1 in the indicative present
subparadigms. Regular verbs from the first conjugation use a uniform stem in the present,
and regular verbs from the second conjugation use an augmented stem in /-s/ in the
plural. In addition to these two patterns, however, there are hundreds of irregular verbs
instantiating others, which can be grouped into three types: either there is one stem for
the singular and one for the plural, or the same stem is used for the singular and for
the third plural, or three different stems are used following the pattern illustrated by
BOIRE.¹⁷


Table 1: Sample French present indicative paradigms illustrating recurrent stem
alternation patterns


                   1SG     2SG     3SG     1PL        2PL        3PL
LAVER ‘wash’       lav     lav     lav     lav-ɔ̃      lav-e      lav      (1st conjugation)
FINIR ‘finish’     fini    fini    fini    finis-ɔ̃    finis-e    finis    (2nd conjugation)
ENVOYER ‘send’     ɑ̃vwa    ɑ̃vwa    ɑ̃vwa    ɑ̃vwaj-ɔ̃    ɑ̃vwaj-e    ɑ̃vwa     (other patterns)
JOINDRE ‘join’     ʒwɛ̃     ʒwɛ̃     ʒwɛ̃     ʒwaɲ-ɔ̃     ʒwaɲ-e     ʒwaɲ     (other patterns)
BOIRE ‘drink’      bwa     bwa     bwa     byv-ɔ̃      byv-e      bwav     (other patterns)


Given the pervasive nature of these alternations and the general unpredictability of
the shapes of the alternants, Bonami & Boyé (2003a) build on previous work by Aronoff
(1994), Brown (1998), Hippisley (1998), and Stump (2001), and posit that each lexeme is
associated with a STEM SPACE, a vector of phonological shapes indicating the shape of the
stem used in some zone of the paradigm. Limiting attention again to the stems found in
the indicative present, the stem space of the verbs under consideration is indicated in
Table 2: Stem 1 is the default stem, Stem 2 is used in the 3PL, and Stem 3 is used in the
singular.


¹⁷ Bonami & Boyé (2006) deliberately set apart a handful of highly irregular and very frequent verbs
instantiating an unpredictable form in the 1SG, 1PL or 2PL.
Table 2: Stem spaces for a sample of French verbs in the present indicative


                   Stem 1    Stem 2    Stem 3
LAVER ‘wash’       lav       lav       lav
FINIR ‘finish’     finis     finis     fini
ENVOYER ‘send’     ɑ̃vwaj     ɑ̃vwa      ɑ̃vwa
JOINDRE ‘join’     ʒwaɲ      ʒwaɲ      ʒwɛ̃
BOIRE ‘drink’      byv       bwav      bwa


In the context of an Item-and-Process view of inflection, Bonami & Boyé (2006) pro- 
pose to encode stem spaces as the value of a feature carried by lexemes, and posit a 
hierarchy of stem space types capturing different patterns of identity among coordi- 
nates in the stem space. This analysis can be readily adapted to the current framework 
by assuming that stem spaces are represented inside pid objects using a list-valued fea- 
ture STEMS. Let us first consider the lexical entry of BOIRE ‘drink’. This needs to list three 
unpredictable stems, as indicated in Figure 13. 


[ lexeme
  SS|CAT|HD  [ verb
               LID drink-rel ]
  PID        [ boire-pid
               STEMS </byv/, /bwav/, /bwa/> ] ]


Figure 13: Lexical entry for BOIRE ‘drink’


The grammar then needs to specify in which context each element in STEMS is to be
used. Following insights from Stump (2001: chap. 6), we assume that this is effected by
STEM SELECTION RULES, a special kind of realisation rule that selects a stem alternant for
insertion. The relevant rules are presented in Figure 14.¹⁸

The first rule states that, by default, lexical identity (i.e. PID) is realised by inserting
the first element on the STEMS list as a morph in position 0.¹⁹ The two other rules add
some allomorphic conditioning: the second element is only used if the morphosyntactic
context is that of a 3PL subject, while the third is used when it is that of a SG subject.
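
The effect of these three rules can be sketched in Python as follows. This is a procedural approximation, not the declarative rule format of Figure 14: the function names and the `SUFFIXES` table are assumptions, with the two suffixes read off Table 1, and the ordering of the `if` tests stands in for the more specific rules pre-empting the default.

```python
from typing import Dict, Sequence, Tuple

Cell = Tuple[int, str]  # (person, number)

# Affixal exponents read off Table 1; cells not listed take no suffix.
SUFFIXES: Dict[Cell, str] = {(1, "pl"): "ɔ̃", (2, "pl"): "e"}


def select_stem(stems: Sequence[str], person: int, number: str) -> str:
    """Pick a stem alternant; the more specific conditions pre-empt the default."""
    if person == 3 and number == "pl":
        return stems[1]  # second element: 3PL subject
    if number == "sg":
        return stems[2]  # third element: singular subject
    return stems[0]      # default: first element


def present_indicative(stems: Sequence[str]) -> Dict[Cell, str]:
    """Build the six present indicative forms from a three-way stem space."""
    return {
        (p, n): select_stem(stems, p, n) + SUFFIXES.get((p, n), "")
        for n in ("sg", "pl")
        for p in (1, 2, 3)
    }


BOIRE = ("byv", "bwav", "bwa")
LAVER = ("lav", "lav", "lav")
paradigm = present_indicative(BOIRE)
```

With the stem space of BOIRE from Table 2, `paradigm` reproduces the last row of Table 1; the same function applied to `LAVER` yields a uniform stem throughout.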

Note that the stem selection rules are in no way sensitive to inflection class. This is in 
keeping with Bonami and Boyé's (2003b, 2006) analysis, which starts from the assump- 
tion that all variation in French conjugation originates in differential distributions of 


¹⁸ We use the em dash (‘—’) to denote an unconstrained string of segments. ‘—’ in a STEMS value thus indicates
that the shape of that stem is not constrained by the rule, type, or lexical entry under consideration.

¹⁹ This rule can be thought of as capturing an inflectional universal, as it simply states that some stem must be
provided for every word. In systems without unpredictable stem allomorphy, this will be the sole element
on the STEMS list. In systems with stem allomorphy, by convention, we place the default stem alternant
first.
[ MPH  { [PH [1], PC 0] }
  MUD  { [pid, STEMS <[1], ...>] } ]

[ MPH  { [PH [1], PC 0] }
  MUD  { [pid, STEMS <—, [1], —>] }
  MS   { [PER 3, NUM pl], ... } ]

[ MPH  { [PH [1], PC 0] }
  MUD  { [pid, STEMS <—, —, [1]>] }
  MS   { [NUM sg], ... } ]


Figure 14: Stem selection rules for French present indicative


alternants in the stem space. That being said, it is useful to characterise classes of flex- 
emes in terms of the patterns of identity they instantiate. In the present context, such a 
classification can be stated in the form of a type hierarchy of pid objects, as indicated in 
Figure 15. 


[Figure 15: type hierarchy rooted in pid, with the subtypes AAx-pid, xBB-pid (STEMS <—, [2], [2]>), full-irreg-pid, reg-I-pid, reg-II-pid, AAB-pid and ABB-pid]


Figure 15: Hierarchy of pid subtypes capturing aspects of the French verbal
stem space


The hierarchy of pid objects highlights the structure of the system, and allows the 
grammar writer to minimise redundancy in the statement of lexical entries. In particular,
all regular verbs can be described with mention of the first stem only, while different
types of irregulars necessitate information on two or more stems in different coordinates 
of the stem space. More sample lexical entries are provided in Figure 16 for illustration. 
Note that the lexical entry for BOIRE of Figure 13 does not need to mention a subtype of 
pid explicitly, since full-irreg-pid is the only subtype compatible with the listing of three 
distinct stems. 
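
The way such a hierarchy lets lexical entries stay partial can be sketched in Python. The function `complete_stems` and its string-based type names are an illustrative assumption: types are modelled as plain labels rather than as a typed feature structure hierarchy, and the treatment of `reg-II-pid` (Stem 3 is Stem 1 minus a final /s/) is one possible reading of the FINIR pattern in Table 2, not a claim about the constraint actually stated in Figure 15.

```python
from typing import Optional, Sequence, Tuple

Stems = Tuple[str, str, str]


def complete_stems(pid_type: str, listed: Sequence[Optional[str]]) -> Stems:
    """Fill in the stems left open (None) in a lexical entry from its pid type."""
    s1, s2, s3 = listed
    if s1 is None:
        raise ValueError("the default stem must always be listed")
    if pid_type == "reg-I-pid":    # uniform stem
        return (s1, s1, s1)
    if pid_type == "reg-II-pid":   # Stems 1 and 2 are Stem 3 augmented with /s/
        if not s1.endswith("s"):
            raise ValueError("reg-II-pid expects a first stem ending in /s/")
        return (s1, s1, s1[:-1])
    if pid_type == "ABB-pid":      # Stems 2 and 3 are identical
        if s2 is None:
            raise ValueError("ABB-pid needs the second stem")
        return (s1, s2, s2)
    if pid_type == "AAB-pid":      # Stems 1 and 2 are identical
        if s3 is None:
            raise ValueError("AAB-pid needs the third stem")
        return (s1, s1, s3)
    if pid_type == "full-irreg-pid":
        if s2 is None or s3 is None:
            raise ValueError("full-irreg-pid needs all three stems")
        return (s1, s2, s3)
    raise ValueError(f"unknown pid type: {pid_type}")


LAVER = complete_stems("reg-I-pid", ("lav", None, None))
FINIR = complete_stems("reg-II-pid", ("finis", None, None))
TAPIR = complete_stems("reg-II-pid", ("tapis", None, None))
TAPISSER = complete_stems("reg-I-pid", ("tapis", None, None))
```

The last two calls show why the type and the stem inventory are independent pieces of information: the same listed stem yields two different stem spaces depending on the pid type.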

Finally, the distinction between PID types and stem inventories provides a simple account
of situations where two verbs belonging to different stem alternation types have
the same basic stem, as is the case e.g. with TAPIR ‘hide’ and TAPISSER ‘paper’, which
both have a basic stem /tapis/, witness the ambiguous PRS.1PL /tapisɔ̃/ ‘we hide’/‘we paper’.
Figure 17 shows the relevant lexical entries.
[ lexeme
  SS|CAT|HD  [ verb, LID wash-rel ]
  PID        [ reg-I-pid, STEMS </lav/, —, —> ] ]

[ lexeme
  SS|CAT|HD  [ verb, LID finish-rel ]
  PID        [ reg-II-pid, STEMS </finis/, —, —> ] ]

[ lexeme
  SS|CAT|HD  [ verb, LID send-rel ]
  PID        [ ABB-pid, STEMS </ɑ̃vwaj/, /ɑ̃vwa/, —> ] ]

[ lexeme
  SS|CAT|HD  [ verb, LID join-rel ]
  PID        [ AAB-pid, STEMS </ʒwaɲ/, —, /ʒwɛ̃/> ] ]


Figure 16: Lexical entries for a sample of French verbs


[ lexeme
  SS|CAT|HD  [ verb, LID hide-rel ]
  PID        [ reg-II-pid, STEMS </tapis/, —, —> ] ]

[ lexeme
  SS|CAT|HD  [ verb, LID paper-rel ]
  PID        [ reg-I-pid, STEMS </tapis/, —, —> ] ]


Figure 17: Lexical entries for two French verbs with homophonous basic stems


To sum up then, PID provides a natural locus for the representation of lexical infor- 
mation on stem alternations, and allows for a natural encoding of Bonami and Boyé's 
notion of a stem space. In addition, in a system where (by hypothesis) all variation in 
inflection is located in the stems, the indication of a specific vector of stem alternants is 
sufficient to fully individuate flexemes. In such a system, the hierarchy of pid values is 
merely used to limit the statement of redundant information in lexical entries. 


3.4 Individuating flexemes: affixal inflection classes 


We now turn to the role of PID in a system with nontrivial affixal inflection classes. As an
illustration, let us examine a subset of the Czech nominal declension system. Table 3 pro- 
vides partial paradigms for four nouns belonging to four of the major inflection classes 
of masculine inanimate and neuter nouns. 

The distinction between hard and soft declension is correlated with the phonological
properties of the stem-final consonant; however, it is not in general possible to categorically
predict whether a noun will belong to a hard or soft declension on the basis of
the phonological shape of its stem. Groups of declensions do share characteristics of
exponence; in particular, it is evident from the table that some exponent strategies are
common to the soft declensions (e.g. -e marking the GEN.SG), to the masculine declensions
(e.g. -ů in the GEN.PL), or to larger groups of declensions (e.g. -ům is used in the
DAT.PL across the declensions shown here, except in the soft neuter). These observations
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Table 3: Partial declension for the four inflection classes of Czech inanimate 
nouns 


MASCULINE NEUTER 
hard soft hard soft 
NOM | most pokoj mést-o moi-e 
sa DEN most-u pokoj-e mést-a moi-e 
DAT | most-u pokoj-i mést-u moř-i 
ACC most pokoj mést-o moř-e 
voc most-e pokoj-i mést-o moi-e 
LOC most-é pokoj-i mést-é moř-i 
INS most-em  pokoj-em | méstem moř-em 
NOM | most-y pokoj-e mést-a moi-e 
pr GEN most-ü pokoj-ü mést moř-í 
DAT most-üm pokoj-üm | měst-ům moï-im 
ACC most-y pokoj-e mést-a moi-e 
voc most-y pokoj-e mést-a moi-e 
LOC most-ech  pokoj-ich | mést-ech  moï-ich 
INS most-y pokoji mést-y moř-i 
‘bridge’ ‘room’ ‘town’ ‘sea’ 


motivate arranging flexemes in a hierarchy of classes, so that the application of rules of 
exponence can be restricted to arbitrary collections of declension classes. We thus pro- 
pose a simpler hierarchy of pid objects reflecting the distinction between hard and soft 
declensions, as indicated in Figure 18. 


pid 


hard-pid soft-pid 


Figure 18: Preliminary hierarchy of pid subtypes for Czech declension 


In addition, we propose that, since gender is inherent for nouns (in contrast to agree- 
ment gender) yet still conditions inflectional realisation, it should be represented as part 
of PID. Hence the lexical entries of the 4 nouns under consideration are as indicated in 
Figure 19. Note that traditional declensions correspond to a combination of a pid subtype 
and a gender value.²⁰ 


²⁰ This bidimensional representation of declension classes is possible because gender is a strict predictor of 
inflection class in Czech: all members of each declension class belong to the same gender. Some declension 
classes corresponding to different genders are very similar, but always differ in at least one paradigm cell: 
e.g. masculine TÁTA ‘dad’ inflects like a feminine hard noun in only about half of its paradigm cells. Also 
note that a full description of the system would require more subtypes of pid, as there are more than 
two classes per gender, and would hence require organizing the pid hierarchy as a dense semi-lattice of 
inflection class groupings (Beniamine & Bonami 2016-09). 

  [lexeme  SS|CAT|HD [noun, LID bridge-rel]  PID [hard-pid, GEN mas, STEMS ⟨/most/⟩]]
  [lexeme  SS|CAT|HD [noun, LID room-rel]    PID [soft-pid, GEN mas, STEMS ⟨/pokoj/⟩]]
  [lexeme  SS|CAT|HD [noun, LID town-rel]    PID [hard-pid, GEN neu, STEMS ⟨/měst/⟩]]
  [lexeme  SS|CAT|HD [noun, LID sea-rel]     PID [soft-pid, GEN neu, STEMS ⟨/moř/⟩]]

Figure 19: Preliminary lexical entries for a sample of Czech nouns 


  gs-rule
    MUD {[CASE gen, NUM sg]}
      |
      +-- MPH ⟨-u⟩   MS {[hard-pid, GEN mas]}
      +-- MPH ⟨-a⟩   MS {[hard-pid, GEN neu]}
      +-- MPH ⟨-e⟩   MS {[soft-pid]}

Figure 20: Preliminary realisation rules for Czech GEN.SG 


To see how this hierarchy helps in capturing the distribution of exponents in Czech, 
consider the partial hierarchy of rules of exponence for the expression of GEN.SG in 
Figure 20. The three rules have the same general structure: they associate a specific 
phonological shape with the expression (through the MUD value) of the GEN.SG, but place a 
condition on that expression by restricting the MS value to contain specific information 
in its pid value. That is, they limit the use of an exponent to flexemes belonging to a 
particular inflection class or group of inflection classes. The first two rules express the 
conditioning in terms of both a type in the pid hierarchy and a gender value. The third 
one, however, does not mention gender, and hence can apply both in the case of masculine 
and neuter soft nouns. 

This simple example illustrates how the typed feature structure architecture allows for 
a straightforward statement of generalisations on exponence across declension types by 
locating inherent inflectional information in PID values and conditioning the application 
of rules of exponence to families of possible PID values. 
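
The conditioning just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation of IbM: the names subsumes, Pid, Rule and gen_sg are hypothetical, the hierarchy is the preliminary one of Figure 18, and the exponents -u, -a and -e are read off Table 3. A rule applies when its pid type is the noun's type or a supertype of it, and when it either does not mention gender or names the noun's gender.

```python
from dataclasses import dataclass
from typing import List, Optional

# Preliminary hierarchy of Figure 18: each type points to its immediate supertype.
SUPERTYPE = {"pid": None, "hard-pid": "pid", "soft-pid": "pid"}


def subsumes(general: str, specific: str) -> bool:
    """Return True if `general` equals `specific` or is one of its supertypes."""
    current: Optional[str] = specific
    while current is not None:
        if current == general:
            return True
        current = SUPERTYPE[current]
    return False


@dataclass(frozen=True)
class Pid:
    type: str  # a type in the pid hierarchy
    gen: str   # inherent gender, "mas" or "neu"


@dataclass(frozen=True)
class Rule:
    exponent: str
    pid_type: str
    gen: Optional[str] = None  # None: the rule does not mention gender


# GEN.SG rules as described in the text; the exponents are read off Table 3.
GEN_SG_RULES = [
    Rule("u", "hard-pid", "mas"),
    Rule("a", "hard-pid", "neu"),
    Rule("e", "soft-pid"),
]


def gen_sg(stem: str, pid: Pid) -> List[str]:
    """Apply every GEN.SG rule whose conditions the noun's PID satisfies."""
    return [
        stem + rule.exponent
        for rule in GEN_SG_RULES
        if subsumes(rule.pid_type, pid.type) and rule.gen in (None, pid.gen)
    ]


for stem, pid in [
    ("most", Pid("hard-pid", "mas")),
    ("pokoj", Pid("soft-pid", "mas")),
    ("měst", Pid("hard-pid", "neu")),
    ("moř", Pid("soft-pid", "neu")),
]:
    print(stem, gen_sg(stem, pid))
```

Because the third rule leaves gender unspecified, a single statement covers both soft declensions, which is the generalisation the hierarchy is meant to capture.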

We conclude this section by noting that the use of stem spaces, inherent features 
such as gender, and type of PID does not necessarily exhaust the inventory of relevant 
information that should be coded inside PID for the languages of the world. For instance, 
Bonami & Lacroix (2011) proposed that lexical information on thematic suffixes in the 
conjugation of the Kartvelian language Laz should be stored as the value of a dedicated 
feature inside the PID, since information on the shape of the thematic affix needs to be 
lexically stipulated but the affix is neither always present nor always contiguous to the 
root; and Crysmann & Bonami (2017) propose a concrete implementation of that idea in 
the context of Estonian declension. Our general claim is that PID should be the sole locus 
of lexically stipulated information on inflection. 


4 Flexemes and overabundance 


In previous sections we have justified the distinction between lexemes and flexemes by 
arguing that a single flexeme (characterised by a single inflectional paradigm) may corre- 
spond to multiple lexemes (characterised by different lexical semantic and/or syntactic 
properties). In this final section we explore situations where one may want to argue the 
opposite: multiple flexemes corresponding to a single lexeme. 

Although we have not made use of it yet, the analytic scheme defined in the previous 
section certainly leaves room for such a possibility. Both for French verbs and Czech 
nouns, we have proposed that pid objects be organised in a hierarchy, capturing fam- 
ilies of inflectional behavior. The lexical entries used thus far all introduce a PID value 
corresponding to a specific leaf type in the hierarchy: hence one flexeme for each lexeme. 
However, if some lexical entries were to refer to some pid supertype, this would autho- 
rise multiple inflectional behaviours for the same lexeme - hence, in a sense, multiple 
flexemes for one lexeme. 

As a matter of fact, both French conjugation and Czech declension provide examples 
of phenomena that are insightfully analysed in this fashion. The phenomena at hand fall 
under the general heading of OVERABUNDANCE (Thornton 2011, 2012, to appear), that is, 
of situations where a single lexeme has multiple realisations for the same set of mor- 
phosyntactic properties. 

First consider the French verb ASSEOIR. There is considerable variation in the realisation 
of different paradigm cells of this verb, leading to free variation at least for some 
paradigm cells in some varieties (Bonami & Boyé 2010). Limiting ourselves again to the 
indicative present, there seem to be two equally felicitous forms for each person-number 
combination in Standard French, as indicated in Table 4. 

Although this situation could be described in terms of overabundance in individual 
paradigm cells, such an approach would not capture the fact that the forms seem to be 
organised in two distinct paradigms, each with two stem alternants, and each instantiating 
a different but familiar pattern of stem allomorphy: the /aswa/ ~ /aswaj/ contrast 
follows an ABB pattern similar to that of ENVOYER (see Table 1), while the /asje/ ~ /asej/ 
contrast follows an AAB pattern similar to that of JOINDRE. It is thus more perspicuous to 
describe this case of overabundance as involving two different stem spaces, and hence 
two different PID values, rather than variation in individual paradigm cells. Figure 21 
shows two appropriate lexical entries corresponding to the two paradigms of ASSEOIR 
that readily integrate with the analysis presented in Section 3 and account for 
overabundance directly. 

Table 4: The two main indicative present subparadigms of ASSEOIR ‘sit’ 

    1SG     2SG     3SG     1PL        2PL        3PL
    aswa    aswa    aswa    aswaj-ɔ̃    aswaj-e    aswa
    asje    asje    asje    asej-ɔ̃     asej-e     asej


  [lexeme  SS|CAT|HD [verb, LID sit-rel]  PID [AAB-pid, STEMS ⟨/asej/, —, /asje/⟩]]
  [lexeme  SS|CAT|HD [verb, LID sit-rel]  PID [ABB-pid, STEMS ⟨/aswaj/, /aswa/, —⟩]]

Figure 21: Lexical entries for two variants of the verb ASSEOIR ‘sit’ 
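
How two stem spaces yield two full subparadigms can be illustrated with a short Python sketch. It rests on assumptions that are not stated in this form in the text: that the three STEMS slots feed, respectively, the 1PL/2PL cells, the 3PL cell and the singular cells of the indicative present (a reading suggested by Table 4 together with Figure 21), that a dash is filled by the identity the pid type names (AAB: slots 1 and 2 identical; ABB: slots 2 and 3 identical), and that the 1PL and 2PL endings are /ɔ̃/ and /e/. The function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

StemSpace = Tuple[Optional[str], Optional[str], Optional[str]]

# Endings assumed for the indicative present; the other cells are bare stems.
ENDINGS = {"1PL": "ɔ̃", "2PL": "e"}


def fill_stem_space(pid_type: str, stems: StemSpace) -> Tuple[str, str, str]:
    """Fill unspecified slots (None) using the identity named by the pid type."""
    s1, s2, s3 = stems
    if pid_type == "AAB-pid":      # slots 1 and 2 are identical
        s1, s2 = s1 or s2, s2 or s1
    elif pid_type == "ABB-pid":    # slots 2 and 3 are identical
        s2, s3 = s2 or s3, s3 or s2
    else:
        raise ValueError(f"unknown pid type: {pid_type}")
    if s1 is None or s2 is None or s3 is None:
        raise ValueError("stem space is underdetermined")
    return s1, s2, s3


def present_indicative(pid_type: str, stems: StemSpace) -> Dict[str, str]:
    """Derive the six indicative present cells from one stem space."""
    s1, s2, s3 = fill_stem_space(pid_type, stems)
    return {
        "1SG": s3,
        "2SG": s3,
        "3SG": s3,
        "1PL": s1 + ENDINGS["1PL"],
        "2PL": s1 + ENDINGS["2PL"],
        "3PL": s2,
    }


# The two lexical entries of Figure 21 share one LID but differ in PID.
ASSEOIR = [
    ("AAB-pid", ("asej", None, "asje")),
    ("ABB-pid", ("aswaj", "aswa", None)),
]

for pid_type, stems in ASSEOIR:
    print(pid_type, present_indicative(pid_type, stems))
```

Under these assumptions each entry generates one row of Table 4, so overabundance arises from the choice between two PID values, not from cell-by-cell variation.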


The French verb ASSEOIR exemplifies a case of stem-based overabundance, which is 
readily accommodated by having two stem spaces for a single lexeme. Let us now turn 
to Czech and discuss a situation of exponent-based overabundance. 

In Section 3.4 we discussed the fact that the Czech inflection system distinguishes 
‘hard’ and ‘soft’ declensions. As it happens, some lexemes follow a hybrid or ‘mixed’ 
pattern that does not clearly fall into one type or the other, but rather makes use of 
both hard and soft exponents. However, this has different manifestations for neuter and 
masculine inanimate nouns, as evidenced by the examples in Table 5. 

The paradigm of the mixed neuter noun KUŘE ‘chicken’ exhibits HETEROCLISIS (Stump 
2006): KUŘE inflects like a soft noun in the singular, but like a hard noun in the plural. 
By contrast, the paradigm of the mixed masculine noun PRAMEN ‘spring’ exhibits a 
combination of heteroclisis and partial overabundance. In the plural, PRAMEN inflects like a 
hard noun; in the singular, it may inflect either like a hard noun or like a soft noun. 
Correctly capturing the difference between these two types of mixed inflectional behaviour 
is a serious challenge for any theory of inflection. 

Both behaviours are readily accommodated in the present framework, using a more 
refined hierarchy of PID values. The crucial insight is that overabundance amounts to 
ambiguity, i.e. disjunctive membership of two inflection classes, whereas heteroclisis 
involves simultaneous membership of two classes: while the former is modelled 
straightforwardly by means of underspecification, corresponding to the JOIN in the semi-lattice 
of pid types, the latter can be captured by overspecification, i.e. the MEET, as shown by 
the type hierarchy in Figure 22. 

Table 5: Overabundance and heteroclisis in Czech declension 

              MASCULINE                                        NEUTER
              hard        mixed                  soft          hard        mixed         soft
    SG  NOM   most        pramen                 pokoj         měst-o      kuř-e         moř-e
        GEN   most-u      pramen-u ~ pramen-e    pokoj-e       měst-a      kuř-et-e      moř-e
        DAT   most-u      pramen-u ~ pramen-i    pokoj-i       měst-u      kuř-et-i      moř-i
        ACC   most        pramen                 pokoj         měst-o      kuř-e         moř-e
        VOC   most-e      pramen-e ~ pramen-i    pokoj-i       měst-o      kuř-e         moř-e
        LOC   most-ě      pramen-u ~ pramen-i    pokoj-i       měst-ě      kuř-et-i      moř-i
        INS   most-em     pramen-em              pokoj-em      měst-em     kuř-et-em     moř-em
    PL  NOM   most-y      pramen-y               pokoj-e       měst-a      kuř-at-a      moř-e
        GEN   most-ů      pramen-ů               pokoj-ů       měst        kuř-at        moř-í
        DAT   most-ům     pramen-ům              pokoj-ům      měst-ům     kuř-at-ům     moř-ím
        ACC   most-y      pramen-y               pokoj-e       měst-a      kuř-at-a      moř-e
        VOC   most-y      pramen-y               pokoj-e       měst-a      kuř-at-a      moř-e
        LOC   most-ech    pramen-ech             pokoj-ích     měst-ech    kuř-at-ech    moř-ích
        INS   most-y      pramen-y               pokoj-i       měst-y      kuř-at-y      moř-i
              ‘bridge’    ‘spring’               ‘room’        ‘town’      ‘chicken’     ‘sea’


                          pid
                        /     \
                hard-pid       soft-pid
               /        \     /        \
    strict-hard-pid    mixed-pid    strict-soft-pid


Figure 22: Improved hierarchy of pid subtypes capturing heteroclite Czech declension 
classes 

Figure 23 shows schematically to which PID value each noun is assigned, and Figure 24 
which PID value rules of exponence for the GEN.SG (left-hand side) and NOM.PL (right-hand 
side) are restricted to. More detailed lexical entries and rules of exponence are presented 
below in Figures 25 and 26. Any noun can be inflected using a realisation rule declared 
with a compatible PID value, that is, a value at any point in the hierarchy that is identical 
to that of the noun, dominates it, is dominated by it, or shares a subtype with it. 

As shown in Figure 23, nouns belonging to non-mixed declensions are assigned to 
either of the two simple leaf types strict-hard-pid (MOST, MĚSTO) and strict-soft-pid (POKOJ, 
MOŘE). The heteroclite noun KUŘE is assigned to mixed-pid, and hence may inflect using 
either hard or soft exponents, but not strict-hard or strict-soft ones. The assignment of 
exponents to pid values (shown in Figure 24) ensures that it must use soft exponents 
in the singular, yet hard exponents in the plural. By contrast, the overabundant noun 
PRAMEN is assigned to an underspecified inflection class, namely hard-pid. As such it may 
use any one of hard-pid, strict-hard-pid, mixed-pid, or soft-pid exponents, but, crucially, 
not strict-soft exponents. This accounts concisely for its contrasting behaviour in 
the singular and the plural: since the GEN.SG exponent -e is restricted only to soft-pid, 
there are two GEN.SG exponents available for PRAMEN, which is thus overabundant. It may 
inflect with -e, by resolving the soft-pid demanded by the rule and the hard-pid demanded by 
PRAMEN to the heteroclite type mixed-pid, or else with -u, by the sheer fact that this is the 
exponent available for all hard-pid words, whether strict or heteroclite. By contrast, the 
NOM.PL exponent -e is constrained to strict-soft-pid. As such, it is inaccessible to PRAMEN, 
which hence behaves like a simple hard masculine noun in the plural. 


                          pid
                        /     \
                hard-pid       soft-pid
                PRAMEN
               /        \     /        \
    strict-hard-pid    mixed-pid    strict-soft-pid
    MOST, MĚSTO        KUŘE         POKOJ, MOŘE


Figure 23: Schematic representation of inflection class assignment for Czech nouns 


[Schematic: the regions of the hierarchy in Figure 22 to which the GEN.SG rules (left) and the NOM.PL rules (right) are restricted.] 

Figure 24: Schematic representation of the scope of rules of exponence for 
Czech nouns 


  [lexeme  SS|CAT|HD [noun, LID bridge-rel]   PID [strict-hard-pid, GEN mas, STEMS ⟨/most/⟩]]
  [lexeme  SS|CAT|HD [noun, LID spring-rel]   PID [hard-pid,        GEN mas, STEMS ⟨/pramen/⟩]]
  [lexeme  SS|CAT|HD [noun, LID room-rel]     PID [strict-soft-pid, GEN mas, STEMS ⟨/pokoj/⟩]]
  [lexeme  SS|CAT|HD [noun, LID city-rel]     PID [strict-hard-pid, GEN neu, STEMS ⟨/město/⟩]]
  [lexeme  SS|CAT|HD [noun, LID chicken-rel]  PID [mixed-pid,       GEN neu, STEMS ⟨/kuře/⟩]]
  [lexeme  SS|CAT|HD [noun, LID sea-rel]      PID [strict-soft-pid, GEN neu, STEMS ⟨/moře/⟩]]

Figure 25: Lexical entries for six Czech nouns 


[Rule hierarchy: gs-rule (CASE gen, NUM sg) and np-rule (CASE nom, NUM pl) are subtypes of decl-rule; the individual rules restrict MS to hard-pid, strict-hard-pid, soft-pid or strict-soft-pid, in some cases together with a GEN value.] 

Figure 26: Realisation rules for Czech GEN.SG and NOM.PL nouns 

We have thus established that mixed overabundant declensions can be accommodated 
by assigning a lexeme to a supertype in the pid hierarchy, while mixed heteroclite de- 
clensions can be accommodated by introducing a subtype intermediate between the hard 
and soft declensions. 
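
The notion of compatibility at work here can be made explicit with a small Python sketch. It is an illustration under assumptions, not the authors' formalisation: two pid types are treated as compatible when they share at least one subtype in the hierarchy of Figure 22 (each type counting as a subtype of itself), gender conditions are left out because only masculine nouns are used, and the restriction of NOM.PL -y to hard-pid is an assumption, since the text only states the restrictions for GEN.SG -u, GEN.SG -e and NOM.PL -e. All function and variable names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hierarchy of Figure 22: each type lists its immediate subtypes.
SUBTYPES: Dict[str, List[str]] = {
    "pid": ["hard-pid", "soft-pid"],
    "hard-pid": ["strict-hard-pid", "mixed-pid"],
    "soft-pid": ["mixed-pid", "strict-soft-pid"],
    "strict-hard-pid": [],
    "mixed-pid": [],
    "strict-soft-pid": [],
}


def descendants(pid_type: str) -> Set[str]:
    """The type itself plus everything below it in the hierarchy."""
    result = {pid_type}
    for sub in SUBTYPES[pid_type]:
        result |= descendants(sub)
    return result


def compatible(a: str, b: str) -> bool:
    """Two types unify if they have at least one subtype in common."""
    return bool(descendants(a) & descendants(b))


# (cell, exponent, pid type the rule is restricted to)
RULES: List[Tuple[str, str, str]] = [
    ("GEN.SG", "u", "hard-pid"),
    ("GEN.SG", "e", "soft-pid"),
    ("NOM.PL", "y", "hard-pid"),         # assumption, not stated in the text
    ("NOM.PL", "e", "strict-soft-pid"),
]

# Masculine nouns only, with the pid types assigned in Figure 25.
NOUNS: Dict[str, Tuple[str, str]] = {
    "MOST": ("most", "strict-hard-pid"),
    "PRAMEN": ("pramen", "hard-pid"),
    "POKOJ": ("pokoj", "strict-soft-pid"),
}


def realise(noun: str, cell: str) -> List[str]:
    """All forms licensed for a cell by rules compatible with the noun's pid."""
    stem, pid_type = NOUNS[noun]
    return sorted(
        stem + exponent
        for rule_cell, exponent, rule_pid in RULES
        if rule_cell == cell and compatible(pid_type, rule_pid)
    )


for noun in NOUNS:
    print(noun, realise(noun, "GEN.SG"), realise(noun, "NOM.PL"))
```

In this sketch the underspecified hard-pid of PRAMEN unifies with soft-pid through their shared subtype mixed-pid, giving two GEN.SG forms, whereas nothing lies below both hard-pid and strict-soft-pid, so only one NOM.PL form is produced.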

The discussion in this section has exhibited the benefits of associating multiple pid 
objects with a single LID value to address some situations of overabundance, which 
amounts to positing that a single lexeme may correspond to multiple flexemes. We by 
no means claim that all overabundance phenomena are best thought of in such terms; 
see Thornton (this volume) for relevant discussion. Rather, we suggest that, where 
overabundance results from a lexeme being ambiguous between two classes of paradigms, 
lexically underspecified pids make good sense of the situation. 


5 Conclusions 


In this paper we have addressed the representation of lexical identity in morphology. 
Following Fradin & Kerleroux (2003), we have argued that a distinction should be made 
between lexemes, individuated in terms of lexical semantics, and FLEXEMES, individuated 
in terms of inflectional paradigms. We have then shown that lexemes and flexemes stand 
in a many-to-many relation: in cases of lexical ambiguity, one flexeme realises multiple 
lexemes; in at least some situations of overabundance, multiple flexemes realise the same 
lexeme. We have shown how this distinction can be integrated into Information-based 
Morphology by providing words with two independent indices: LID and PID. 

The distinction between LID and PID clarifies the role of lexical identity at the 
interface between inflectional morphology and syntax: syntax cares about lexemes, but not 
flexemes; inflectional morphology cares about flexemes, but not about lexemes. In the 
present framework, this is captured by the fact that LID is not represented in MS, the 
input to rules of inflection. Arguably, the distinction is also useful to clarify the role of 
lexical identity in lexeme formation. Recent work on French lexeme formation has 
highlighted the many-to-many nature of lexeme formation rules (see Bonami & Crysmann 
2016: §3.1 and references cited therein): typically, a single formal process may be 
associated with multiple meanings, and the same type of meaning may be realised by multiple 
processes. Bonami & Tribout (2012) and Tribout & Bonami (2014-07) explore how the 
LID/PID distinction can be used to make sense of this many-to-many relation. In their 
analytic scheme, lexeme formation rules are organised in a bidimensional multiple 
inheritance hierarchy, with one dimension laying out formal strategies, and the other 
dimension describing a syntactic/semantic operation. Formal strategies determine a new PID 
from that of the base, while syntactic/semantic operations amount to constructing a new LID 
from that of the base. 

More work is needed to integrate Bonami and Tribout's insights into IbM, but this 
integration paves the way towards a general, underspecification-based framework for 
morphological analysis. 


The morphology of essence predicates in Chatino 


Hilaria Cruz 

Gregory Stump 
Professor emeritus, University of Kentucky 


In the Chatino language [Oto-Manguean; Mexico], essence predicates are a class of pred- 
icative lexemes exhibiting a special complex of properties that distinguishes them from 
other kinds of predicates. We characterize this complex of properties with evidence from 
the San Juan Quiahije (SJQ) variety of Chatino. After examining the principal morphosyn- 
tactic characteristics of essence predicates, we focus particular attention on their patterns 
of person/number marking, on which basis we distinguish two possible hypotheses about 
the grammatical status of essence predicates: the possessed-subject hypothesis and the com- 
pound predicate hypothesis. We then assess these hypotheses in light of four kinds of evi- 
dence: the structural variety of essence predicates, their external syntax, their general lack 
of semantic compositionality, and their relation to the distributional flexibility of subject- 
agreement marking in Chatino. On the basis of this evidence, we conclude that neither the 
possessed-subject hypothesis nor the compound predicate hypothesis is fully adequate; we 
therefore propose an alternative way of situating essence predicates in the wider context of 
Chatino morphosyntax. 


1 Introduction 


Our intention here is to characterize a distinctive class of predicates in Chatino; we 
call this the class of ESSENCE PREDICATES. As we show, the members of this class share 
certain distinctive morphosyntactic characteristics; at the same time, they are also 
heterogeneous with respect to various criteria. Their interest here resides in the superficial 
ambiguity of their structure: in some ways, this resembles the syntactic combination 
of a verb and its subject, while in other ways, it resembles the morphological structure 
of a compound predicate. In section 2, we examine the fundamental features of essence 
predicates. Their patterns of person/number marking (section 3) suggest two alternative 
analyses of their structure, one syntactic, the other morphological. In section 4, we 
examine four characteristics of essence predicates as a way of gauging the relative adequacy 
of the two competing analyses. In view of the equivocal outcome of this examination, 
we conclude (section 5) that essence predicates are, in fact, neither verb-subject 
combinations nor ordinary compound predicates, but lexemes whose realization is invariably 
periphrastic and whose content stems from the special function of a handful of 
grammaticalized nouns. 


2 Basic characteristics of essence predicates 


One of the defining features of essence predicates is their structure, which comprises a 
predicative base followed by a nominal component. For example, the essence predicate 
ndi riq? ‘s/he was thirsty’¹ comprises the predicative base ndi* ‘be thirsty’ and the noun 
riq’ ‘essence’; its inflectional paradigm is given in Table 1. Essence predicates exhibit a 
wide range of predicative bases, but there is only a handful of choices for the nominal 
component, the most common being riq?. 


Table 1: Paradigm of the essence predicate ndi riq? ‘s/he was thirsty’ [thirsty 
essence] in SJQ Chatino 


COMPLETIVE PROGRESSIVE HABITUAL POTENTIAL 
eo — ndi* reng” ndi? reng”? ndyi* reng” tyi? reng” 
256 ndi riq' ndi” rig! ndyi* riq' tyi? rig! 
ao ndt riq? ndi” rig? ndyi* riq? tyi? rig? 

di! 2 en! nd? Ze pgi 20460 ry icy 
uNCL ndi reng en ndi” reng en! ndyi* reng en tyi” reng en 
1ExCL ndi riq? wa"? ndi? riq? wa?  ndyi^ riq? wa? tyi” riq? wa? 
2PL ndt riq? wan! ndi? riq? wan! ndyi riq? wan! tyi? riq? wan! 
3PL ndi riq? renq! ndi? riq? renq! ndyi* riq? renq! tyi? riq? renq! 


In view of its structure, the inflectional morphology of essence predicates differs from 
that of simple verbal lexemes. These differences can be seen by comparing the inflectional 
paradigm of the essence predicate ndi* riq? ‘s/he was thirsty’ in Table 1 with that of the 
simple verbal lexeme yqan? ‘s/he washed’ in Table 2.² 


¹ Here and throughout, we generally use a verb's third-person singular completive form as its citation form; 
deviations from this practice are duly noted. We employ the following abbreviations: CPL ‘completive 
aspect’, PROG ‘progressive aspect’, HAB ‘habitual aspect’, POT ‘potential mood’; DEM ‘demonstrative’; ABS 
signifies that a referring expression’s referent is absent; ESS = riq”, tye?? or qin‘; EV.MOD = event modifier; 
and CBM = cranberry morpheme. A superscript 0 represents a floating super high tone, 1 a high tone, 2 a 
mid high tone, 3 a low mid tone, and 4 a low tone. Contour tones are represented as combinations of these 
numerals. For details concerning the SJQ Chatino tone system, see Cruz (2011), Woodbury (to appear). 


Table 2: Paradigm of the simple verbal lexeme yqan?? ‘s/he washed’ in SJQ Chatino 

COMPLETIVE PROGRESSIVE HABITUAL POTENTIAL 
1SG yqan?* ntyqan! ntyqan* voan") 
2SG voan" ntyqan?? ntyqan*? yqan*? 
3SG voan)? ntyqan?? ntyqan^* voan?) 
uNCL yqan® an?  ntyqan| am _ntyqan"* yqan"* 
1EXCL yqan? wa“? ntyqan?? wa“? ntyqan\* wa“? yqan'* wa? 


32 32 


2PL ` van! want ntyqan? wan* ntygqan** wan? yqan™* wan 
42 


3PL  yqan” rengt ntyqan? rengt ntyqan^* reng" yqan?* reng" 


As Table 2 shows, the singular forms of a simple verbal lexeme are single, synthetic 
word forms inflected both for aspect/mood and for subject person and number. The cor- 
responding plural forms consist of a verb form inflected for aspect/mood and an enclitic 
pronominal element marking person and number; in general, this pronominal element 
appears only in the absence of an overt subject constituent, in the presence of which 
the verb simply appears in its default third-person singular form. As Table 1 shows, 
essence predicates differ from simple verbal lexemes in satisfying what Rasch (2002) 
calls the Compound Inflection Criterion, according to which an essence predicate ex- 
hibits aspect/mood marking on its predicative base but person and number marking on 
its nominal component, where, again, the marking of plural persons takes the form of an 
enclitic that only appears in the absence of an overt subject constituent. The one 
complication is that in the first-person plural inclusive, subject agreement is marked twice, not 
only by the clitic en’, but also by ablauting of the nominal component, which appears as 
renq? rather than as riq? in Table 1. 

Tables 1 and 2 show that the essence predicate ndi* riq? is like a verb in inflecting for 
aspectual and modal properties; but not all essence predicates are similarly verb-like. 
We take this as evidence that essence predicates are heterogeneous with respect to their 
syntactic category membership. In SJQ Chatino, the criteria in (1) are diagnostics of the 
distinction between verbs and adjectives. By criterion (1a), the predicate yqan“? ‘s/he 
washed’ in Table 2 is a verb because it exhibits distinct completive, progressive, habitual 
and potential subparadigms. By contrast, the predicate tqi* ‘sick’ in Table 3 does not, and 
is therefore an adjective according to criterion (1a). Similarly, yqan* and tqi* may both 
be used predicatively (as in (2)), but only tqi* is used attributively (e.g. (3a)); in order to 
modify a noun as part of a noun phrase, yqan** must appear as part of a relative clause 
introduced by the pronominal no‘ ‘one’, as in (3b). Thus, criterion (1b) also leads to the 
conclusion that yqan? is a verb and tqi*, an adjective. 


² The 1INCL clitic appearing as en! in Table 1 and as an* ~ an! in Table 2 gets its vowel quality from its host 
and is manifested as a lengthening of the preceding vowel mora. (Note, however, that verbs with tone 14 
do not undergo mora lengthening in the first person inclusive, so that superficially, they appear to lack the 
1INCL enclitic, as in Table 2.) Its tone is generally determined by a process of progressive tone sandhi (Chen 
2004); but verbs whose basic tone is 4 instead exhibit a regressive process by which their tone becomes 24. 
It is evidently the historical reflex of a clitic that was once constant in form. This constant form survives 
as the clitic nat in Zenzontepec Chatino (Campbell 2011). For details of the idiosyncratic sandhi exhibited 
by the 1INCL enclitic, see Cruz (2011). 


(1) a. Verbs inflect for aspect and mood, but adjectives do not. 


b. Adjectives may be used attributively, but verbs cannot (except as part of a 
relative clause). 


(2) a. tqi*  no‘     kiqyu’. 
       sick  one(s)  male 
       ‘The men are sick.’ 

    b. ntyqan?    no‘     kiqyu’. 
       wash.PROG  one(s)  male 
       ‘The men are washing.’ 


(3) a. no‘     kiqyu!  tqi* 
       one(s)  male    sick 
       ‘the sick men’ 

    b. no*     kiqyu!  *(no*)  ntyqan? 
       one(s)  male    one(s)  wash.PROG 
       ‘the men that are washing’ 


Table 3: Paradigm of the adjective tqi* ‘sick’ in SJQ Chatino 


1SG tqen 
286 tq? 
ao  tqi* 


UNCL tqen™* en? 


1EXCL tqi wa? 
2PL  tqi* wan 
3PL  tqi* rengt 


4 


By these diagnostics, it appears that some essence predicates are verbs and others, 
adjectives. Unlike the essence predicate ndi* riq? ‘s/he was thirsty’ but like the adjective 
tqi^ ‘sick’, the essence predicate tqi^ riq? [sick essence] ‘s/he was scornful’ in Table 4 
does not inflect for aspect and mood. Moreover, a comparison of (4) and (5) reveals that 
while tqi riq? readily appears in attributive position, the attributive use of ndi* riq? 
requires a relative clause construction. Thus, although ndi* riq? and tqi* riq? are both 
essence predicates, the diagnostics in (1) suggest that the former is a verb³ and the latter, 
an adjective.⁴ 


Table 4: Paradigm of the essence predicate tqi* riq? [sick essence] ‘s/he is scornful’ 
in SJQ Chatino 


eo — tqi* reng”? 
2856 tqt riq 

3sG  tqi* rig’ 
INCL tqi reng? enl 
IEXCL tqi* rig? wa“? 
2PL  tqi* rig’ wan! 


3PL tqi* rig’ renq! 


(4) Ntqan^       qin?  no*  kiqyu!  tqi^  riq?         qnya!  kanq”. 
    see.CPL.1SG  ACC   one  male    sick  essence.3SG  me     DEM 
    ‘I saw the guy scornful of me’ 


(5) Ntqan**      qin?  no*  kiqyu!  *(no*)  ndi?          riq’         kanq”. 
    see.CPL.1SG  ACC   one  male    *(one)  thirsty.PROG  essence.3SG  DEM 
    ‘I saw the guy who is thirsty’ 


Most essence predicates denote a particular psychological state or disposition, as the 
representative examples in Table 5 reveal. Some essence predicates, however, denote a 
physical state, as in Table 6; and there are also occasional examples that have an active 
rather than a stative or dispositional meaning, as in Table 7. 

In nearly all cases, riq? ‘essence’ seems to be interpretable as ‘X’s self’, making the 
essence predicate similar to a lexically reflexive verb in a language like French: skeq! 
riq? ‘il se méprend’, sqwi* riq? ‘elle se souvient’, ndwe* riq? ‘il s'inquiète’, tno* nga* 
tye? ‘elle se sent courageuse’. Note, however, that argument reflexives are expressed by 
means of a reflexive pronoun in Chatino, as in (6) and (7). We return to the semantic 
issues raised by essence predicates in Section 4.3. 


(6) Ti?          kwenq^ en?  qnyi*    qnya*. 
    EV.MOD:only  myself      hit.CPL  OBJ.PRON.1SG 
    ‘I flagellated myself.’ 


³ This conclusion further implies that ndi* is itself a verb, but its status as a verb is not independently 
demonstrable, given that it is a kind of cranberry morpheme, appearing as part of the essence predicate 
ndi riq? and nowhere else. 

⁴ The question naturally arises whether an essence predicate's predicative base is ever a noun. There are 
occasional instances in which this superficially appears to be the case, but closer scrutiny leaves room for 
doubt. For example, the essence predicate tnya? riq? ‘s/he is hardworking’ seems to have the noun tnya? 
‘work’ as its predicative base, but tnya? also seems to have adjectival uses, as in 
Not nga?^ tnya* [one be Spoo working] ‘the ones who are authorities’. 


Table 5: Some representative essence predicates in SJQ Chatino 


Essence predicate 


Gloss 


Component parts 


nkqar* riq? 
sa’ rig’ 
ndon” rig? 
sqwi^ riq? 
seng? riq? 


ndwe* riq? 
skwa? rig’ 

tqi^ rig? 

sqwe rig”/tye*? 
liqa'* riq? 

chin* nga” tye?” 
ndya? rig’ tye?” 
xqan! nga’ tye? 


“s/he remembered’ 


‘s/he was smart, fast, agile’ 


‘s/he was happy’ 
‘s/he remembered’ 
‘s/he was upset’ 
‘s/he pitied’ 

‘s/he was sad’ 
‘s/he worried’ 
‘s/he was fed up’ 
‘s/he hated’ 


‘s/he was generous/happy’ 


“s/he was taciturn’ 


‘s/he was scared/queasy' 


's/he liked' 
‘s/he felt angry’ 


[sit essence] 

[light essence] 
[standing essence] 
[exist essence] 

[CBM essence] 

[pity essence] 
[turn.around essence] 
[minced essence] 
[lying essence] 

[sick essence] 

[good essence/chest] 
[slow essence] 

[ugly feel chest] 
[like essence chest] 
[mean feel chest] 


Table 6: Some essence predicates denoting physical states in SJQ Chatino 


Essence predicate       Gloss                           Gloss of component parts 

lw? riq? ~ lw? tye**    ‘s/he is fair-skinned’          [clean essence ~ chest] 
ndyi* riq?              ‘s/he was thirsty, wheezing’    [CBM essence] 
nta! riq?               ‘s/he is dark-skinned’          [dark essence] 
nteq? riq2              ‘s/he is hungry’                [hungry essence] 
tit riq?                ‘s/he is skinny’                [skinny essence] 
tjog riq’               ‘s/he is sturdy’                [strong essence] 
tlyaq? riq?             ‘s/he is cold’                  [cold essence] 
tyjyan?? riq?           ‘s/he is skinny’                [skinny essence] 
tykeq" riq?             ‘s/he is hot’                   [hot essence] 

Table 7: Some essence predicates with an active denotation in SJQ Chatino 


Essence predicate          Gloss                      Glosses of component parts 

lyeq? riq? ~ lyeq? tye?    ‘s/he mocks’               [bully essence ~ chest] 
tlaq* riq? ~ tlaq* tye’?   ‘s/he placates’            [cool essence ~ chest] 
skwi! riq?                 ‘s/he takes a liking to’   [CBM essence] 


(7) ti            kwiq^    ui           Tyu“   kwa’  qnyi!    qin”. 
    EV.MOD:still  himself  EV.MOD:only  Peter  DET   hit.CPL  OBJ.PRON.3SG 
    ‘Peter flagellated himself.’ 


3 Person/number marking in essence predicates 


An essence predicate exhibits person/number marking on its nominal component. Per- 
son/number marking has a complex distributional pattern in Chatino; in this section, we 
propose to situate essence predicates within this complex pattern by comparing them 
with simplex verbs, inalienably possessed nouns, and compound verbs. These compar- 
isons lead us to entertain two competing hypotheses about the morphosyntax of essence 
predicates: the possessed-subject hypothesis (according to which essence predicates em- 
body a verb-subject construction, defined by the syntax of Chatino) and the compound 
predicate hypothesis (according to which essence predicates belong to a larger class of 
predicative—mainly verbal—compounds, defined by the morphology of the language). 


3.1 Comparison to person/number marking in simplex verbs 


A prominent feature of Chatino grammar is the heavy use of tone contrasts in its inflec- 
tional system (Cruz 2011, Cruz & Woodbury 2013). Consider, for example, the paradigm 
of the simple verb sqi? ‘s/he bought’ in Table 8. In this paradigm, contrasts in aspect/ 
mood are marked in three ways: (i) a nasal prefix distinguishes the progressive and the 
habitual from the completive and the potential, (ii) a stem-initial consonant alternation 
distinguishes the completive and the progressive (both with stem-initial s) from the ha- 
bitual (stem-initial ch) and the potential (stem-initial x), and (iii) the completive and the 
progressive share one pattern of tone alternation, while the habitual and the potential 
share another. Within a particular aspect/mood subparadigm, contrasts in person and 
number are marked both tonally and—in the plural forms—by the use of personal clitics 
(in the absence of an overt subject constituent); in first-person singular and first-person 
plural inclusive forms, the verb stem also exhibits nasalization, sometimes in combina- 
tion with ablaut. Verbs fall into a number of different conjugation classes that are dis- 
tinguished mainly by their paradigms' patterns of tone alternation. Thus, despite some 
similarities, the pattern of tone alternation in the paradigm of sqi? 's/he bought' contrasts 
with the pattern of yqan“? ‘s/he washed’ observed earlier in Table 2; these contrasting 
tone patterns are given in Table 9. For extensive details on conjugation-class distinctions 
in Chatino, see Cruz & Woodbury (2013), Woodbury (to appear). 
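
The two segmental devices described above (the nasal prefix and the stem-initial consonant alternation) can be laid out as a small lookup. This is only a minimal sketch for the verb ‘buy’: tone, person/number marking, nasalization and ablaut are deliberately left out, and the names `NASAL_PREFIX`, `STEM_INITIAL`, `STEM_REMAINDER` and `segmental_form` are illustrative assumptions rather than part of the analysis. The split of the stem into an initial consonant plus a remainder `qi` is likewise assumed for the purposes of the sketch.

```python
# Segmental skeleton of the aspect/mood forms of 'buy', following the prose
# description only. Tone is NOT modelled here.

# (i) A nasal prefix sets the progressive and habitual apart from the
#     completive and potential.
NASAL_PREFIX = {
    "completive": "",
    "progressive": "n",
    "habitual": "n",
    "potential": "",
}

# (ii) The stem-initial consonant groups completive with progressive (s)
#      and sets off the habitual (ch) and the potential (x).
STEM_INITIAL = {
    "completive": "s",
    "progressive": "s",
    "habitual": "ch",
    "potential": "x",
}

# Assumed segmentation: everything after the alternating initial consonant.
STEM_REMAINDER = "qi"


def segmental_form(aspect_mood: str) -> str:
    """Return the toneless segmental shape for one aspect/mood value."""
    return NASAL_PREFIX[aspect_mood] + STEM_INITIAL[aspect_mood] + STEM_REMAINDER


FORMS = {am: segmental_form(am) for am in NASAL_PREFIX}

if __name__ == "__main__":
    for aspect_mood, form in FORMS.items():
        print(f"{aspect_mood:12s} {form}")
```

Taken together, the two devices already separate all four aspect/mood values segmentally; the shared tone patterns in (iii) then cut across them by pairing completive with progressive and habitual with potential.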

Essence predicates participate in this system of tone contrasts, but in a different man- 
ner from simplex verbs. In the inflection of a simplex verb, a verb form's tone exhibits 
a kind of cumulative exponence, serving to distinguish (or to help distinguish) both the 
form's aspect/mood and its person/number. In the inflection of an essence predicate, by 
contrast, forms do not exhibit this sort of cumulation, but conform to the Compound 
Inflection Criterion, with the predicative base carrying the tone relevant to identifying 
Table 8: Paradigm of the verbal lexeme sqi? ‘s/he bought’ in SJQ Chatino


       COMPLETIVE  PROGRESSIVE  HABITUAL  POTENTIAL
1SG    sqen^?  nsqen^?  nchqin*®  xqin*®
2SG    sqi  nsqi!  nchqi?°  xqi?0
3SG    sq  nsqi?  nchqi*  uf?
1INCL  sqen? en!  nsqen? en!  nchqin*  xqin*
1EXCL  sqi? wa?  nsqÿ wa?  nchqil*® wa?  xqil® wa“?
2PL    sqi wan!  nsqi? wan!  nchqi^ wan?  xqi wan?
3PL    sq? renq!  nsqi? reng  nchq* reng?  xqil* renq


Table 9: Tone patterns of two verbal lexemes in SJQ Chatino 


COMPLETIVE PROGRESSIVE HABITUAL POTENTIAL 


sq? 18G 40 40 40 40 

‘s/he bought! 2sc 1 1 20 20 
3sG 2 32 14 14 
INCL 2-1 2-1 14 14 
1EXCL 2-42 2-42 140-42 140-42 
2PL 2-1 2-1 14-0 14-0 
3PL 2-1 2-1 14-0 14-0 

voan)? 1sG 24 24 24 

‘s/he washed’ 2scG 32 32 42 42 
38G 42 32 24 24 
INCL 42-42 1-1 14 14 
1EXCL 42-42 32-42 14-42 14-42 
2PL 42-4 32-4 24-32 24-32 
3PL 42-4 32-4 24-32 24-32 
its aspect or mood and its nominal component carrying the tone that helps distinguish
its person and number. (See again the inflection of ndi* riq? ‘s/he was thirsty’ in Table 1.)
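
The contrast drawn here between simplex verbs and essence predicates can be stated as a difference in where each feature is marked. The following is a minimal sketch under simplifying assumptions: it records only the site of marking, not the tones themselves, and the names `MarkingSites`, `is_cumulative` and `conforms_to_cic` are hypothetical labels introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkingSites:
    """Which element of a predicate carries the tone for each feature set."""
    aspect_mood: str
    person_number: str


# Simplex verb: a single form's tone helps distinguish both feature sets.
SIMPLEX_VERB = MarkingSites(aspect_mood="verb", person_number="verb")

# Essence predicate: the two feature sets are marked on different elements.
ESSENCE_PREDICATE = MarkingSites(
    aspect_mood="predicative base",
    person_number="nominal component",
)


def is_cumulative(sites: MarkingSites) -> bool:
    """True if one and the same element marks both feature sets."""
    return sites.aspect_mood == sites.person_number


def conforms_to_cic(sites: MarkingSites) -> bool:
    """True if marking is split as the Compound Inflection Criterion describes."""
    return (
        sites.aspect_mood == "predicative base"
        and sites.person_number == "nominal component"
    )


if __name__ == "__main__":
    print(is_cumulative(SIMPLEX_VERB), conforms_to_cic(SIMPLEX_VERB))
    print(is_cumulative(ESSENCE_PREDICATE), conforms_to_cic(ESSENCE_PREDICATE))
```

The sketch treats the two patterns as categorical; it makes no claim about predicates that deviate from either pattern.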


3.2 Comparison to person/number marking in inalienably possessed 
nouns 


The exponents of person and number employed in verb inflection also appear in the in- 
flection of nouns, where they serve to express the properties of an inalienable possessor. 
Thus, in the paradigm of the noun skon? ‘arm’ in Table 10, the person and number of an 
inalienable possessor are expressed by tone and—in the plural (in the absence of an overt 
possessor constituent)—by a clitic. Different nouns exhibit different patterns of tone al- 
ternation in their inflection for an inalienable possessor; thus, the tone pattern in the 
paradigm of yqan! ‘mother’ (Table 11) is different from that of skon? ‘arm’. Cruz (2016) 
distinguishes seven classes of nouns according to their patterns of tone alternation. 


Table 10: Inflection of the noun skon? ‘arm’ for an inalienable possessor's person
and number in SJQ Chatino (E. Cruz)

Possessor  Possessum

1SG    skon  ‘my arm’
2SG    skon’  ‘your (sg) arm’
3SG    skon?  ‘his/her arm’
1INCL  skon?on!  ‘our arm’
1EXCL  skon? wat?  ‘our arm’
2PL    skon? wan!  ‘your (pl) arm’
3PL    skon? renq!  ‘their arm’


Table 11: Inflection of the noun yqan! ‘mother’ for an inalienable possessor's 
person and number in SJQ Chatino (E. Cruz) 


Possessor Possessum 


1SG yqan? 'my mother' 

2SG voan)? ‘your (sg) mother’ 
3SG voan) ‘his/her mother’ 
1INCL yqan! an! 'our mother' 
1EXCL yqan! wa? ‘our mother’ 

2PL yqan! won! ‘your (pl) mother’ 
3PL yqan' reng! ‘their mother’ 


In view of this correspondence of form between a verb’s subject-agreement marking
and a noun’s inalienable possessor marking, one might hypothesize that an essence
predicate’s nominal component is in fact a subject denoting an individual’s inalienably
possessed essence, and that its person-number marking therefore marks the person and
number of the possessor of this essence. Indeed, riq? belongs to an inflection class
differing minimally from that of skon? ‘arm’, exhibiting the same pattern of tone alternation
as in Table 10 except in the first-person singular (where riq? exhibits tone 20 instead of
tone 40). Accordingly, given the additional fact that Chatino is verb-initial, one might be
drawn to conclude that the literal sense of the form ndi* renq?” (analyzed in Table 1 as ‘I
was thirsty’) is ‘my essence is thirsty’, that of a verb-subject combination whose subject
is the noun riq? ‘essence’ inflected for a first-person singular inalienable possessor and
whose predicate is, appropriately, the third-person singular progressive form of ndi’?
‘be thirsty’. On this POSSESSED-SUBJECT HYPOTHESIS, an overt noun phrase apparently
serving as the subject of an essence predicate is instead seen as a possessor, so that (i)
no* kyqyu! kwa’ ‘that guy’ is a possessor in (8) exactly as in (9), and (ii) the head of the
subject constituent in (8) is riq? ‘(his) essence’, paralleling tqwa* ‘(his) mouth’ in (9).


(8) La! riq? no! kyqyu! kwa’.
open essence one male that
‘That guy is friendly, talkative.’ [= ‘That guy’s essence is open’, according to the
possessed-subject hypothesis.]
(9) La! tqwa* no* kyqyu! kwa’.
open mouth one male that
‘The guy's mouth is open.’


This is a tempting analysis, but there is also an alternative possibility: the COMPOUND
PREDICATE HYPOTHESIS, according to which essence predicates are a class of compound
predicates taking mostly experiencer subjects. In order to evaluate this hypothesis, we
now consider person/number marking in compound predicates in SJQ Chatino.


3.3 Comparison to person/number marking in compound verbs 


Consider the compound verbs yku* jyaq? ‘s/he tasted’ [eat amount] and ykwiq* sla?
‘s/he dreamed’ [speak tiredness], whose paradigms are given in Tables 12 and 13. Each
compound consists of a verbal element (yku* ‘s/he ate’, ykwiq* ‘s/he spoke’) and a nominal
element (jyaq? ‘amount’, sla? ‘tiredness’). The verbal element is like an essence predicate's
predicative base, inflecting for aspect/mood but not ordinarily for person and
number (though the verbal element sometimes exhibits agreement in the first person
singular, as in Table 12); likewise, the nominal element is like an essence predicate's
nominal component, since it carries the person/number inflection. In other words, the
inflectional pattern again tends to conform to Rasch's Compound Inflection Criterion.⁵


⁵Compound predicates are nevertheless somewhat varied in their properties in SJQ Chatino. Compound
verbs whose inflection deviates from the Compound Inflection Criterion may do so in more than one way.
In the inflection of some compound verbs, person and number, like aspect and mood, are marked on the
first, verbal element rather than on the following nominal element (e.g. snyi* chaq ‘s/he dealt, negotiated’
Table 12: Paradigm of the compound predicate yku* jyaq? ‘s/he tasted’ [eat
amount] in SJQ Chatino

       COMPLETIVE  PROGRESSIVE  HABITUAL  POTENTIAL
1SG    ykon! jyanq?  ntykon! jyanq?  ntykon?? jyanq?  kon” jyanq?
2SG    yku“ jyaq'  ntyku*? jyaq!  ntyku* jyaq'  ku* jyaq!
3SG    yku* jyaq?  ntyku*? jyaq?  ntyku* jyaq?  ku* jyaq?
1INCL  vk! jyan? ang’  ntyku? jyanq? an!  ntyku* jyanq? an  ku* jyanq? an!
1EXCL  yku* jya wa?  ntyku? jyaq? wa?  ntyku* jyaq? wa?  kut jyaq? wa“?
2PL    yku* jyaq? wan?  ntyku? jyaq? wan!  ntyku* jyaq? wan?"  kut jyaqd? wan”*
3PL    yku* jyaq? reng!  ntyku? jyaq? reng"  ntyku* jyaq? reng!  kut jyaq? reng")

Table 13: Paradigm of the compound verb ykwiq* sla? [speak tiredness] ‘s/he
dreamed’ in SJQ Chatino

       COMPLETIVE  PROGRESSIVE  HABITUAL  POTENTIAL
1SG    ykwiq! slan*®  ntykwiq?? slan“°  ntykwiq* slan??  tykwiq* slan*?
2SG    ykwiq sla!  ntykwiq? sla  ntykwiq? sla!  tykwiq* sla!
3SG    ykwiq' sla?  ntykwiq? sla?  ntykwiq' sla  tykwiq sla?
1INCL  ykwiq* slan? an!  ntykwiq? slan? an!  ntykwiq* slan? an!  tykwiq* slan? an!
1EXCL  ykwiq* sla’ wa"?  ntykwiq? sl? wg"  ntykwiq? sla wa?  tykwiq* sla wa“?
2PL    ykwiq sla’ wan!*  ntykwiq? sla won!  ntykwiq? sla’ wan"  tykwiq* sla? wan"
3PL    ykwiq® slæ reng)  ntykwiq" sla? reng"  ntykwiq' sla? reng!  tykwiq* sla? rend?


As Rasch (2002) and Cruz & Woodbury (2013) observe, compound verbs in Chatino are
quite varied in their structure, consisting of a verb paired with a stem of any of a range
of categories to form either a head-complement structure (as in (10a)) or a head-modifier
structure (as in (10b)), but not, in general, to form a verb-subject structure.⁶


[grab word]); in the inflection of other compound verbs, aspect and mood, like person and number, are
marked on the second, nominal element rather than on the preceding verbal element (e.g. xi*? skwa? ‘s/he
turned (s.o.) over’ [cause be.in.elevated.position]); still others sporadically exhibit person/number marking
on both the verbal and the nominal elements (as with ykon! jyanq? ‘I tasted’ in Table 12); and yet others
exhibit marking of aspect and mood on both the verbal and the nominal elements (e.g. sti! qo?? ‘s/he
made fun of’ [laugh with]). See Cruz & Woodbury (2013) for details concerning these deviations from the
Compound Inflection Criterion in SJQ Chatino.

⁶Despite initial resemblances, a compound verb such as ykwiq^ sla? ‘s/he dreamed’ cannot be seen as the
phrasal combination of a verb with an independent postverbal constituent. As a VSO language, Chatino
ordinarily positions a verb's subject between the verb and a following complement or modifier, as in (i);
but a compound verb is followed by its subject, as in (ii). Moreover, the nominal component of a compound
verb carries the verb's person/number inflection, as in (iii), but a verb's object does not, as (iv) shows.
(10) a. nchu! yaq?
's/he clapped' [hit hand] 
b. yku‘ na? 


‘s/he ate in secret’ [eat hidden] 


Whether as a verb-complement structure or a verb-modifier structure, the compound 
verb tends to conform to the Compound Inflection Criterion. This similarity between 
an essence predicate such as ndi* riq? ‘s/he was thirsty’ and a compound verb such as 
yku* jyaq? 's/he tasted' raises the possibility that essence predicates are in fact simply 
a subclass of compound predicates. If this is so, then an essence predicate's nominal 
component does not obviously function as an argument of its predicative base. Instead, 
it seems to serve as a quasi-adverbial modifier: ndi* renq! ‘I was thirsty inside’. On this
analysis, the person/number marking on an essence predicate's nominal component is 
not an expression of possession, but (as in the compound verb yku* jyaq? ‘s/he tasted’) 
an ordinary expression of subject agreement. 

In the following section, we assess the relative adequacy of the possessed-subject and 
compound predicate hypotheses in light of four kinds of evidence. 


4 Assessing the possessed-subject and compound 
predicate hypotheses 


We now consider four important characteristics of essence predicates in SJQ Chatino:
their structural variety, their external syntax, their general lack of semantic composition- 
ality, and their relation to the distributional flexibility of subject-agreement marking. As 
we show, this evidence reveals that neither the possessed-subject hypothesis nor the 
compound predicate hypothesis accounts for the full range of characteristics exhibited 
by essence predicates. 


(i) Ykwiq* no* qan! kwa skaf poema?4.
speak.CPL one female that one poem
‘That woman spoke a poem.’

(ii) Ykwiq* sla? no* qan! kwa’.
speak.CPL tiredness one female that
‘That woman dreamt.’

(iii) Ykwiq* slan?? nka?.
speak.CPL tiredness.1SG yesterday
‘I dreamed yesterday.’

(iv) Ykwenq! chaq?-xlyal? nka?.
speak.CPL.1SG word-Castilian yesterday
‘I spoke Spanish yesterday.’
4.1 Structural variety


Essence predicates vary in their structure in at least three ways. First, there is variation 
with respect to the identity of the nominal component, which we have so far exemplified 
mainly with riq? ‘essence’. Second, there is variation with respect to the possibility of 
employing more than one nominal component within the same essence predicate. And 
third, essence predicates vary with respect to their predicative base—specifically, with 
respect to whether the predicative base has independent uses apart from its use in an 
essence predicate. Consider each of these areas of variation. 


4.1.1 Choice of nominal component


The examples of essence predicates cited so far have nearly all had the noun riq? ‘essence’
as their nominal component. This is, indeed, the most usual nominal component for
essence predicates. There is, however, a sizeable class of essence predicates whose nominal
component is instead tye?? ‘chest’; one such predicate is nqne“? tye?? ‘s/he dared’,
whose paradigm is given in Table 14. Still another class of essence predicates has the nominal
component qin* (whose low tone makes it frequently susceptible to tone sandhi); an
example is the predicate skeq! qin?* ‘s/he (wrongly) thought or believed’ [imagine essence]
in Table 15.


Table 14: Paradigm of the essence predicate nqne“? tye?? ‘s/he dared’ [do chest]
in SJQ Chatino

       COMPLETIVE  PROGRESSIVE  HABITUAL  POTENTIAL
1SG    qne? tyin”?  nqne" tyin”?  nqne"! tyin”?  qne” tyin?°
2SG    qne”? tye”  nqne** tye?  nqne"! tye*?  qne^* tye?
3SG    qne”? tye?  nqne? tye?  nqne^ tye?  qne” tye?
1INCL  qne* tyin'in'  nqne? tyin! in  nqne^ tyin'in!  qne^ tyin! in!
1EXCL  qne? tye” wa‘?  nqne!" tye? wa"?  nqne"! tye?? wa?  qne” tye? wa?
2PL    qne”? tye? want  nqne" tye’? want  nqne"! tye" want  qne” tye? want
3PL    qne? tye” rend)  nqne!" tye? rengt  nqne"! tye?? renq*  qne" tye" reng"


The identity of qin* in skeq! qin?* ‘s/he wrongly thought or believed’ is debatable,
since qin* has a variety of functions in Chatino; for example, qin* functions (with tone
sandhi) as a third-person singular pronoun in (11a), but arguably as an animal classifier
in (11b).

(11) a. Ye? qa% yku tykwen! qin?* sen.
very EMPH eat.CPL bedbug OBJ.PRON:3SG last.night
‘Bedbugs bit her last night.’
Table 15: Paradigm of skeq! qin?* ‘s/he (wrongly) thought or believed’ [imagine
essence] in SJQ Chatino

       COMPLETIVE  PROGRESSIVE  HABITUAL  POTENTIAL
1SG    skeq' qnya^*  nskeq! qnya^*  nxkeq! qnya^*  xkeq' qnya™*
2SG    skeq! qin”?  nskeq! qnya”*  nxkeq! qnya^*  xkeq' qnya™*
3SG    skeq! qin”  nskeq' qin”  nxkeq! qin”*  xkeq! qin”
1INCL  skeq! qin?*  nskeq! qin?*  nxkeq! qin?*  xkeq! qin?*
1EXCL  skeq! qwa”?  nskeq! qwa*  nxkeq! qwa*  xkeq! qwa*
2PL    skeq! qwan“  nskeq! qwan  nxkeq! qwan  xkeq' qwan
3PL    skeq! qin^* reng)  nskeq! qin™* reng"  nxkeq! qin?* reng!  xkeq' qin% reng"?


b. Yla*? qin* qo! snyiq** qin”.
arrive.CPL ANIMAL.CLF with offspring OBJ.PRON:3SG
‘The (animal) returned home with his offspring.’


Although riq?, tye?? and qin* are not freely interchangeable as the nominal component
of an essence predicate, they do exhibit a partial overlap in their distribution; in cases of
overlap, the choice of nominal component may or may not serve to express a difference
in meaning. The forms in (12) constitute a minimal triplet in which the predicative base
sqwe? ‘good’ combines with riq? (‘essence’), tye?? (‘chest’), or qin* (‘his or her essence’),
with each combination expressing a different meaning.


(12) a. sqwe? riq?
‘s/he was in a good mood’
b. sqwe? tye?
‘s/he was generous’
c. sqwe? qin?*
‘s/he was affable’


Several cases in which riq?, tye?? and qin* may be used more or less interchangeably
are listed in Table 16a. The essence predicates in Table 16b involve riq? and tye?? but have
no alternative with qin*; conversely, those in Table 16c involve riq? and qin* and have
no alternative with tye??. Those in Table 16d involve riq? but not tye?? or qin*; those in
Table 16e involve tye?? but not riq? or qin*; and those in Table 16f involve qin* but not
riq? or tye??.⁷
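
The six-way grouping of Table 16 amounts to a classification by the set of nominal components with which a predicative base can express a given meaning. A minimal sketch of that classification follows; the toneless labels `"riq"`, `"tye"` and `"qin"` and the function name `table16_class` are assumptions made for illustration, and the sketch returns `None` for the combination of tye and qin without riq, which the text does not describe.

```python
from typing import Iterable, Optional

# Subsections of Table 16, keyed by the set of nominal components available
# for expressing one and the same meaning (tone marks omitted).
TABLE16_CLASSES = {
    frozenset({"riq", "tye", "qin"}): "a",
    frozenset({"riq", "tye"}): "b",
    frozenset({"riq", "qin"}): "c",
    frozenset({"riq"}): "d",
    frozenset({"tye"}): "e",
    frozenset({"qin"}): "f",
}


def table16_class(components: Iterable[str]) -> Optional[str]:
    """Return the Table 16 subsection for a set of nominal components.

    Returns None for combinations the text does not discuss
    (e.g. tye and qin without riq, or an empty set).
    """
    return TABLE16_CLASSES.get(frozenset(components))


if __name__ == "__main__":
    print(table16_class(["riq", "tye"]))  # alternates with tye but not qin
    print(table16_class(["riq"]))         # riq only
    print(table16_class(["tye", "qin"]))  # not described in the text
```

Note that the input is the set of components expressing the same meaning: a base such as sqwe? ‘good’, which combines with all three components but with a different meaning each time, is not thereby a member of class (a).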

Even where the choice of nominal component corresponds to a difference of meaning, 
it is not clear that the nature of this difference is predictable. For example, the general 
sense of pity may be expressed by an essence predicate consisting of qna? and either 


⁷It might appear that in Table 16d, tqi* riq? ‘s/he hates’ has a counterpart with tye??, but tqi* tye?? only
has the literal meaning ‘her/his chest hurts’, not that of an essence predicate.
Table 16: Some essence predicates in SJQ Chatino


Based on riq? 


Based on qin* 


Based on tye?? 


a. 's/he understood 


‘s/he is naive’ 
‘s/he is getting angry’ 


‘s/he misperceives’ 
‘s/he is happy’ 

‘s/he is fuzzy’ 

‘s/he is cool (not hot)’ 


‘s/he is hot? 


b. ‘s/he pities’ 
“s/he remembers’ 


‘s/he is greedy’ 
‘s/he is sad’ 

c. ‘s/he is fast or ina 
hurry’ 
‘s/he is satisfied/ 
satiated’ 

d. ‘s/he hates’ 
‘s/he knows/is aware’ 
‘s/he is worry’ 
“s/he remembers’ 
‘s/he is disgusted’ 
‘s/he is ecstatic’ 

e. ‘s/he is scared/ queasy’ 
‘s/he likes’ 
“s/he feels brave’ 
‘s/he feels angry’ 
“s/he feels sad’ 

f. ‘s/he is affable’ 
‘s/he is a thief or fast 


nkwa? jyaq? riq? nkwa? jyag? tye 24 


ntul tye? 
ntykwen? tye24 


ntu! riq? 
ntykwen? rig?* 


skeq! riq? skeq! tye? 
stu! riq! stu! tye? 
swaq?* riq 22 swaq?* tye?? 
tlaqt riq? tlaq'* tye? 
tykegl^ riq? [tykeq'* tye? 
‘s/he is angry?] 
qna? rig? qna? tye? 
sqwit riq? [sqwi* Dei? 
‘s/he holds a grudge’] 
tkong! rig? tkonq! tye?” 
xkuq*? rig? xkuq*? tye32 


saf riq?, ndla? riq? 
ylaq? rig? 


tgi* rig? 

jlyo?? rig? 

32 pig? 

nkya*? yqwi?? riq! 

stya* rig? 

styil rig? 
chin* ngat tye?? 
ndya^^ rig? tye?” 
tno* nga?4 tye?” 
xqan!0 ngat tye?” 
tqwa'* nka?* tye?? 


> 


[nkwa? jyaq? gin?* 
“s/he was tried] 

ntul qin? 

[ntykwen? gin?* 
“s/he choked on sth’] 
skeq! qin4 
swaq?* qin?? 
[tlaq'* qin? 

‘s/he is in peace] 
tykeq'* qin? 


ndla? qin? 


saf gin’, ylag*? qin* 


sqwe? qin? 
sat qin* 


realized Ess] 


stupid Ess] 
choke Ess] 


imagine Ess] 
gusto Ess] 
slovenly Ess] 
cool Ess] 


hot Ess] 


pity Ess] 
exist Ess] 


greedy Ess] 
CBM Ess] 
hurry Ess] 


satiated Ess] 


sick Ess] 
CBM ESS] 
minced Ess] 
remember Ess] 
place Ess] 
laugh Ess] 
ugly Ess] 
like Ess] 
big Ess] 
mean Ess] 
cool Ess] 
good Ess] 
light Ess] 


In the three central columns, bracketed essence predicates have a meaning different from that of the
corresponding essence predicate with riq?. Note that tone sandhi alters the expected tonality of third-person
singular riq? in some of these forms.


riq? or tye??, and the nuanced difference expressed by this choice in (13) is not obviously
predictable from the semantic difference between riq? ‘essence’ and tye?? ‘chest’. Note, by
way of contrast, that the meaning of disgust expressed by the essence predicate stya* riq?
has no counterpart with tye??: *stya* tye??. Moreover, the meaning ‘s/he is sad’ may be
expressed by an essence predicate with either riq? or tye?? (as either xkuq*? riq? or xkuq*?
tye??), but the meaning ‘s/he feels sad’ is expressed by an essence predicate requiring
tye?? rather than riq? (as tqwa'* nka?* tye?? but not *tqwa'* nka?* riq?).
(13) a. Qna? qa?* riq? La?'ya?^ kwa? xneq? qin’.
pity very essence Hilaria that dog POSS.3SG
‘Hilaria feels bad for her dog.’
b. Qna? qa?* tye? La? ya^ kwa?, nkjwi? xneq? qin’.
pity very essence Hilaria that die.CPL dog POSS.3SG
‘Hilaria is pitiable, her dog died.’


These facts suggest that choices among the nominal components riq?, tye?? and qin*
in essence predicates are often (perhaps always) determined by lexical stipulation.


4.1.2 Combinability of nominal components 


It is often possible to use riq? and tye?? in tandem, as in Table 17.⁸ In such cases, it is
tye?? rather than riq? that exhibits the person-number agreement; for instance, the first-person
singular completive form of njlya?? riq? tye? ‘s/he forgot’ is njlya? riq? tyin”?
‘I forgot’. It is not clear that qin* appears in tandem with either riq? or tye?? in its
function as the nominal component of an essence predicate; in those cases in which it
might appear to do so, it instead serves one of its other functions, e.g. that of an animal
classifier (as in tkonq! riq? tye? qin”* ‘that animal is gluttonous’).


4.1.3 Cranberry predicative bases


Essence predicates also vary with respect to the independence of their predicative base. 
On one hand, there are essence predicates whose predicative base also appears indepen- 
dently (though usually not with the same meaning as the essence predicate), as in (14). 
On the other hand, there are instances whose predicative base does not have an inde- 
pendent use as a predicate, as in (15)-(18); such predicative bases are in effect cranberry 
morphemes. 


(14) a. Sqwe? riq? / tye?? no* kyqyu!.
good essence / chest.3SG the male (ones)
‘The men are in a good mood / generous.’
b. Sqwe? no* kyqyu!.
good the male (ones)
‘The men are good.’
(15) a. Ndi*? riq? Xwa? kwa’.
thirsty.PROG essence.3SG Juan that
‘Juan is thirsty.’
b. * Ndi’? Xwa? kwa’.
thirsty.PROG Juan that
Sought interpretation: ‘Juan is thirsty’


⁸In Table 17 and some later tables, ‘#’ marks forms that we have not encountered and that aren't clearly
acceptable, but whose acceptability to at least some speakers we do not wish to rule out.
Table 17: Instances of riq? used in tandem with tye?? in SJQ Chatino


Gloss rig? + tye?? Gloss rig? + tye?” 

‘s/he forgives’ chag? tlyu? rig? tye?? 's/he likes sth' snyi* rig? tye? 

's/he forgets' jlya?? rig? tye*? 's/he is sqwe? rig? tye?” 
generous/happy' 


‘s/he knows/is aware’ 
“s/he mischievous, 
playful’ 

‘s/he worries’ 

‘s/he is open, 
extroverted’ 

‘s/he is taciturn’ 

‘s/he is fair-skinned’ 
‘s/he mocks’ 


‘s/he realizes’ 

‘s/he is happy’ 

‘s/he is worry’ 

‘s/he likes, loves’ 
‘s/he is thirsty, 
wheezing’ 

‘s/he remembers’ 
‘s/he realizes’ 

‘s/he remembers’ 
‘s/he is dark-skinned’ 
‘s/he is hungry’ 
‘s/he is feeling lazy’ 
‘s/he is weak’ 


‘s/he is stupid’ 
‘s/he gets mad’ 
‘s/he gets used to’ 


‘s/he pities’ 

‘s/he is smart, fast, 
agile’ 

‘s/he is upset’ 


‘s/he is standoffish’ 
‘s/he misperceives' 


‘s/he is fed up’ 
‘s/he takes a liking to’ 
‘s/he is desirous’ 


jlyo?? rig’ tye? 
jnya?? riq? tye? 


ndwe riq? tye?” 
la! rig? tye?? 


liqa riq! tye?? 
lw rig? tye? 
lyeq? riq? tye?? 


ndi?? rig? Gei? 

ndon*? rig? tye?? 
ndwe?? riq? tye?? 
ndya^^ riq? tye?? 
ndyi?? rig? tye?” 


nkqan* riq? tye?? 
nkwa? jyaq? rig? tye?? 
nkya*? yqwi?? riq! tye?? 
nta!t riq? tye 
nteq?? riq? tye?? 
ntjal rig’ tye? 
ntqan! riq! tye?” 


32 


ntul0 riq! tye?? 
ntykwer? rig? tye 
ntyqan| rig? re)? 


ER 


qna? riq? tye?? 


saf riq? tye 
zeng? riq! rei? 


siyeq? riq? tye? 
skeq! rig? tye? 


skwa? riq? tye?? 
skwi! rig? tye?? 


sn yal rig? tye?? 


‘s/he remembers’ 
‘s/he is strong/sturdy’ 


‘s/he is happy’ 
‘s/he is disgusted’ 


‘s/he is ecstatic’ 

‘s/he is hard-working’ 
‘s/he is frugal/takes 
care of sth’ 


‘s/he is sturdy’ 

‘s/he is greedy’ 

‘s/he placates, calms’ 
‘s/he is cold’ 

‘s/he is tired’ 


‘s/he is fully conscious’ 
‘s/he scorns’ 

‘s/he is flirtatious’ 
“s/he is smart’ 

‘s/he is slow’ 

‘s/he is hot? 

‘s/he made up her/his 
mind’ 

‘s/he is sad’ 

‘s/he is afraid’ 

‘s/he is fed up 
with/tired of’ 

‘s/he bullies’ 

‘s/he believes/is 
gullible’ 

‘s/he is 
satisfied/satiated’ 
‘s/he is shy’ 

‘s/he breaks a bad 
habit/learns a lesson' 
‘s/he is skinny’ 

‘s/he is skinny’ 


sqwi* riq? tye? 
sqye! riq? tye?? 


stul rig? tye?? 
stya* rig? rei? 


styi! riq? rei? 
t( (nya? riq? tye? 
tjeng? riq? tye?? 


tjog* rig? rei? 

tkonq! riq? tye?? 
tlaq!^ riq? tye*? 
tlyaq* rig? tye?? 
inyaq* rig? re)? 


tqa?t riq? tye? 

igi rig’ tye?? 

tsa? riq? tye?? 

tya?0 rig? tye?? 
tyaq* riq? tye?? 
tykeq'4 riq? tye?? 
wa? xtya?0 rig? tye?? 
xkuq* rig? tye*? 
sont) rig’ re)? 
xyaq? rig? tye?? 


xyuq! riq? tye?? 
ya? ntyqan* riq! Deh tye?? 


ylaq*?/ndlaq*? rig? tye?? 


yqu?0 rig’ tye?? 
sksa* riq? tye? 


sti rig? tye?? 
#tyjyan?° rig? Dei? 
(16) a. Ndi” riq? sti! Xwa? kwa’.
thirsty.PROG essence.3SG father.3SG Juan that
‘Juan’s father is thirsty’
b. * Ndi’? stit Xwa? kwa’.
thirsty.PROG father.3SG Juan that
Sought interpretation: ‘Juan’s father is thirsty’

(17) a. Ndi? renq’.
thirsty.PROG essence.1SG
‘I am thirsty.’
b. * Ndi”.
thirsty.PROG
Sought interpretation: ‘I am thirsty’

(18) a. Ndi” riq? sten!.
thirsty.PROG essence.3SG father.1SG
‘My father is thirsty’
b. * Ndi? stent.
thirsty.PROG father.1SG
Sought interpretation: ‘My father is thirsty’


Table 18 lists some essence predicates whose predicative bases have independent uses, 
and Table 19, some whose predicative bases are cranberry morphemes. As inspection of 
both tables reveals, the meaning expressed by an essence predicate L usually cannot 
be equivalently expressed by using L’s predicative base by itself; either the predicative 
base of L differs in meaning from L (as in (14)) or it is simply unavailable for use as an 
independent predicate (as in (15)-(18)). 

Summarizing, we have seen that essence predicates exhibit three sorts of structural 
variety: in their choice of nominal component; in whether they exhibit one nominal com- 
ponent or two; and in whether their predicative base has uses apart from the essence 
predicate. None of these sorts of structural variety is unexpected under the compound 
predicate hypothesis. Because a compound constitutes a lexeme, two compounds may 
differ in lexically idiosyncratic ways. Despite their closely related meanings, the English 
compound nouns German shepherd and Shetland sheepdog differ in their internal logic; 
while one can imagine alternative combinations such as Germany sheepdog and Shet- 
lander shepherd, each breed has its own conventional name agreed upon on the occasion 
of its coinage. In the same way, the use of riq?, tye*?, qin* or the combination riq? tye??
as an essence predicate's nominal component is a matter of convention enforced by the 
lexicon of Chatino. The incidence of essence predicates whose predicative base is a cran- 
berry morpheme is further testimony to their lexical status; in such cases, the predicative 
base, like the were- in English werewolf, persists long after losing its status as an inde- 
pendent lexeme. If one instead views essence predicates as predicates having inalienably 
possessed subjects, the structural variety examined here is somewhat unexpected. On 
Table 18: Essence predicates whose predicative bases are also used independently
in SJQ Chatino


Essence predicate 


Gloss 


Independent use of predicative base 


chag tly? rig? 


ksa* rig? 


ndwe* riq? 
la! riq? 

lwi riq? 
ndo*? riq? 
ndya* riq? 
nkqan* rig? 
ntal riq? 
ntjal rig’ 
ntu? riq? 


ntykwer? riq? 


qna? riq? 
sat riq? 


siyeq? riq? 
skwa? riq? 
snyi* rig? 


sqwe? riq?/tye?? 


sqwi* rig? 


sqyel* riq? 


styil riq? 


‘s/he forgives’ 


‘s/he breaks a 
bad habit/learns 
a lesson’ 

‘s/he worries’ 


‘s/he is open, 
extroverted’ 
‘s/he is 
fair-skinned’ 


‘s/he is happy’ 


‘s/he likes, 
loves’ 

‘s/he 
remembers’ 
‘s/he is 
dark-skinned’ 
‘s/he is feeling 
lazy’ 

‘s/he is stupid’ 


‘s/he gets mad’ 


‘s/he pities’ 
‘s/he is smart, 
fast, agile’ 
‘s/he is 
standoffish’ 
‘s/he is fed up’ 


‘s/he likes sth’ 


‘s/he is 
generous/ 
happy’ 

‘s/he 
remembers’ 
‘s/he is strong/ 
sturdy’ 

‘s/he is ecstatic’ 


[strong essence] 
[break essence] 
[minced 
essence] 

[open essence] 
[clean essence] 
[stand essence] 
[arrive essence] 
[sit essence] 
[black essence] 
[lazy essence] 
[stupid essence] 
[climb essence] 
[poor essence] 
[light/fast 
essence] 

[vain essence] 
[lie.elevated 
essence] 


[grab essence] 


[good essence] 


[exist essence] 


[strong essence] 


[laugh essence] 


chaq? tlyu? not kyqyu! 


ksa* yka* 


ndwe* not kyqyu! 
la! not kyqyu! 

lwi? not kyqyu! 
ndo*? not kyqyu! 
ndya^^ nof kyqyul 
nkqan* not kyqyu! 
nta no? kyqyu! 
ntja! not kyqyu! 
ntu! no? kyqyul 
ntykwer? not kyqyul 


qna? not kyqyul 
sat nof kyqyu! 


siyeq? not kyqyu! 
skwa? not kyqyu! 
snyi^ not kyqyu! 


sqwe? not kyqyul 


sqwi* no* kyqyu! 
sqyel^ no? kyqyu! 


styi! no* kyqyu! 


‘because the 
men are strong’ 
‘to break a piece 
of wood' 


"Ihe men will be 
minced. 

“The men got 
open? 

“The men are 
clean’ 

‘The men are 
standing” 

‘The men 
arrived. 

"Ihe men are 
sitting. 

*Men are black: 


*Men are lazy: 


"Ihe men are 
stupid or slow: 
"Ihe men 
climbed up: 
‘the poor men’ 
"Ihe men are 
light or fast’ 
‘Men are vain: 


"Ihe men are 
lying elevated: 
"Ihe men 
grabbed ? 


“Men are good. 


"Ihe men exist. 


“The men are 
strong” 
‘The men are 


laughing’ 
t(j)nya? riq?  ‘s/he is hard-working’  [work essence]  no! kyqyu! tnya  ‘the men who are authorities’
ti* riq?  ‘s/he is skinny’  [skinny essence]  ti* no* kyqyu!  ‘The men are skinny.’
tjenq? riq?  ‘s/he is frugal or to take care of sth’  [sticky essence]  tjenq® no* kyqyu!  ‘The men are sticky.’
tjoq* riq?  ‘s/he is sturdy’  [strong essence]  tjoq* no* kyqyu!  ‘The men are strong.’
tkonq' riq?  ‘s/he is greedy’  [ambitious essence]  tkonq! no* kyqyu!  ‘The men are ambitious.’
tlaqt riq?  ‘s/he placates, calms’  [cool essence]  tlaqt no* kyqyu!  ‘The men are cooled, calm.’
tnyaq' riq?  ‘s/he is tired’  [tired essence]  tnyaq^ no? kyqyu!  ‘The men are tired.’
tqa” riq?  ‘s/he is fully conscious’  [complete essence]  tqa no* kyqyu!  ‘all men’
tqi* riq?  ‘s/he hates’  [sick essence]  tqi* no* kyqyu!  ‘The men are sick.’
tya?? riq?  ‘s/he is smart’  [smart essence]  tya?? no* kyqyu!  ‘The men are smart.’
tyaq* riq?  ‘s/he is slow’  [slow essence]  tyaq* no! kyqyu!  ‘The men are slow.’
tyjyan?? riq?  ‘s/he is skinny’  [skinny essence]  tyjyan?? no* kyqyu!  ‘The men are skinny.’
tykeq'* riq?  ‘s/he is hot’  [hot essence]  tykeq'* no? kyqyu!  ‘The men are hot (temp).’
xyaq riq?  ‘s/he is fed up with/tired of’  [mix essence]  xyaq? no* kyqyu!  ‘The men will be mixed.’


that conception, the choice among riq?, tye?? and qin* as subjects should seemingly be
independent of the choice of predicate, and they should not appear in tandem (any more 
than you and they should appear in tandem to produce sentences such as * You they left). 


4.2 External syntax 


With only occasional exceptions, the components of an essence predicate can be inter- 
rupted by members of a small class of elements; their syntax relative to these elements 
is a revealing criterion for evaluating the possessed-subject and compound predicate hy- 
potheses. The class of interruptors includes the elements in (19), some of which Rasch 
(2002: 10) labels EVENT MODIFIERS; we extend his terminology to the full class. These 
may intervene between a verb and its subject, as in examples (20)-(25) (where verb and 
subject are in boldface). Correspondingly, they may sometimes intervene between an 
essence predicate's predicative base and its nominal element, as in (26)-(33), in which 
the interrupted essence predicates are in boldface. 




9 The morphology of essence predicates in Chatino

Table 19: Essence predicates whose predicative bases are not used independently in SJQ Chatino

Essence predicate   Gloss
jlya® riq?   ‘s/he forgot’
jlyo? riq?   ‘s/he knew/was aware’
jnya? riq?   ‘s/he was mischievous, playful’
ngwi riq?   ‘s/he realized’
ndwe* riq?   ‘s/he worried’
ndya? riq?   ‘s/he liked, loved’
ndi* riq?   ‘s/he was thirsty, wheezing’
ntenq riq?   ‘s/he was hungry’
ntqan! riq'   ‘s/he was weak’
ntyqan! riq?   ‘s/he got used to’
qna) riq?   ‘s/he pitied’
senq” riq!   ‘s/he was upset’
skeq' riq? / qin   ‘s/he mistook, misperceived’
skwi! riq?   ‘s/he took a liking to’
snyal riq?   ‘s/he was desirous’
stu! riq?   ‘s/he was happy’
stya* riq’   ‘s/he was disgusted’
tlyaq? riq?   ‘s/he was cold’
tsa’ riq’   ‘s/he was flirtatious’
tya”? riq?   ‘s/he was careful’
xkuq? riq?   ‘s/he was sad’
xqnyi* riq?   ‘s/he was afraid’
xyaq’ riq’   ‘s/he was fed up with/tired of’
xyuq' riq?   ‘s/he bullied’
ylaq*/ndlaq* riq?   ‘s/he was satisfied/satiated’

(19) a. sqwe?   ‘good, well’
     b. ka24   ‘able to; expression of emphasis’
     c. ye42   ‘very’
     d. la24   ‘comparative’
     e. qa24   ‘very’
     f. kcha*   ‘crazy’
     g. ti^, tit   ‘very, still, just’
     h. kcha* qa!   ‘crazy’

(20) Ntqan? ti^ La?'ya? kwa? kna!, kwent? qa?^* ntqo! xqya”*.
see.CPL EV.MOD:just Hilaria that snake loud very leave.CPL scream
‘As soon as Hilaria saw the snake, she screamed very loudly.’

(21) Nkwa? tqi* ka? nten" no? yku^ kla? xi’ kang” qa”.
be.CPL sick EV.MOD:EMPH people that eat.CPL fish bad that.ABS very
‘People who ate the bad fish got really sick.’

(22) Ti? ykwiq! ye” silya qo? chaq? tyqo! qo! ja‘ slya! qal.
EV.MOD:very speak.CPL EV.MOD:very police with.3SG to leave and NEG agree.CPL NEG
‘The police pleaded with him to leave and he refused (to leave).’

(23) Ykwiq' la! sti*-qo? kwa* kef neq'-sya? kaal,
speak.CPL EV.MOD:more father-saint that then type.people-justice that
‘The priest spoke more than the authorities.’

(24) Ya‘? kcha‘ no qan! lyuq™ kwa’.
go.away.CPL EV.MOD:crazy one female little that
‘That little girl went (somewhere) aimlessly.’

(25) Qya* kcha' qa! kyo” nka.
fall.CPL EV.MOD:crazy EV.MOD:very rain yesterday
‘It rained crazy, unpredictably yesterday.’

(26) Nkqan* sqwe? riq? nof qan! chaq? tsa?* kwaq?.
sitting.ground.PROG EV.MOD:well essence the female (one) COMPL go.away.POT that
‘That woman remembers well that she has to go to the party.’

(27) sqwal? ye? chaq? tye.
put CPL.3SG EV.MOD:very word/thing chest
‘He really encouraged him.’

(28) Ndon qa” riq? Xwa? kwa? ndon”? tqwa‘-tqan* qin’.
stand.PROG EV.MOD:very essence Juan that stand.PROG mouth-house his
‘Juan was very happy standing in his front porch.’

(29) Ndon? ti‘ riq? Xwa? kwa’ ndon”? tqwa‘-tqan* qin’.
stand.PROG EV.MOD:only essence Juan that stand.PROG mouth-house his
‘Juan was just happy standing in his front porch.’

(30) Ndon? ka” riq? Xwa? kwa? ndon”? tqwa^-tqan* qin’.
stand.PROG EV.MOD:very essence Juan that stand.PROG mouth-house his
‘Juan was sure very happy standing in his front porch.’

(31) Ndon? la” riq? Xwa? kwa? ndon® tqwa'-tqan* qin*.
stand.PROG EV.MOD:more essence Juan that stand.PROG mouth-house his
‘Juan was happier standing in his front porch.’

(32) Ndo* kcha* qa! riq?.
happy EV.MOD:crazy EV.MOD:very essence
‘He is crazy happy.’

(33) Stu! kcha? *(qa?*) riq! Xwa? kwa? nkya’i.
gusto EV.MOD:crazy EV.MOD:very essence Juan that go.away.CPL
‘Juan left awfully happy.’


Strikingly, compound predicates generally resist the intrusion of an event modifier, a 
fact reflected by the unacceptability of (34). When an event modifier combines with a 
compound predicate, it generally follows it, as in (35). Yet, event modifiers in general do 
not follow essence predicates, as the evidence in (36) and (37) attests. Similarly, event 
modifiers do not typically follow the subject of a clause. Thus, in (38), the event modifier 
may intrude between the verb yli? ‘it grew’ and its subject yka?*-knyi?* kwa? ‘that tree 
graft’ (as in (38a)) but cannot follow the subject (*(38b)). The overarching generalization is 
that an event modifier typically follows the head of a predicate phrase, whether this head 
be simplex or compound. This generalization suggests that because an event modifier 
typically follows an essence predicate's predicative base, the essence predicate itself is 
phrasal. 


(34) *Ykon! ten? jyanq* skwa! qin?*.
eat.CPL.1SG EV.MOD:only.1SG measure.1SG soup his
Sought interpretation: ‘I only tasted his soup.’

(35) Ykon! jyanq”* ten?! skwa! qin?*.
eat.CPL.1SG measure.1SG EV.MOD:only.1SG soup his
‘I only tasted his soup.’

(36) a. Ndon* qa” riq’.
stand.PROG EV.MOD:very essence
‘S/he is very happy.’
b. *Ndon? riq? qal.
stand.PROG essence EV.MOD:very
Sought interpretation: ‘S/he is very happy.’

(37) a. Qne? sqwe tye’.
do.CPL EV.MOD:good chest.3SG
‘S/he dared do something.’
b. *Qne? tye” sqwe*.
do.CPL chest.3SG EV.MOD:good
Sought interpretation: ‘S/he dared do something.’

(38) a. Ylu? sqwe? yka?^-knyi?* kwa’.
grow.CPL EV.MOD:well tree-graft that
‘That grafted tree grew really well.’
b. *Ylu? yka?*-knyi^ sqwe’.
grow.CPL tree-graft EV.MOD:good
Sought interpretation: ‘That grafted tree grew really well.’


This distributional generalization about event modifiers is, however, deceptively broad, 
because event modifiers exhibit a number of idiosyncrasies in their interaction with 
essence predicates. On one hand, the event modifiers ti^ / tit ‘very, still, just’, ka24 ‘able
to’, la^ ‘comparative’, kcha* ‘crazy’, and kcha* qa! ‘crazy’ intervene quite freely between 
the parts of an essence predicate with two components; thus, all of these event modifiers 
may appear in the contexts in (39). On the other hand, if an essence predicate has three 
or more components, these event modifiers exhibit a much more variable pattern of dis- 
tribution, as the examples in (40) suggest. 


(39) ‘s/he worries’ ndwe* — riq?
‘s/he remembers’ nkqan* — riq?
‘s/he is standoffish’ siyeq? — riq?
‘s/he is daring’ tno — tye”
‘s/he is afraid’ xqnyi* — riq?
(40) ‘s/he forgives’ chaq? * tlyu? * riq (very idiomatic)
‘s/he realizes’ nkwa? * jyaq ✓ riq
‘s/he made up her/his mind’ wa? — * xtya? ✓ riq?/tye?
‘s/he feels sad’ tqwa^ ✓ nka? * tye”
‘s/he believes/is gullible’ ya? * ntyqan* d riq! * ou?


Moreover, the event modifiers sqwe? ‘good’, ye42 ‘very’ and qa24 ‘very’ exhibit a much
higher degree of idiosyncrasy in their capacity to intervene between the parts of an 
essence predicate, as the examples in Table 20 show. This irregularity very likely has 
more than one cause. Some interventions are semantically improbable, e.g. *senq” sqwe
riq’ ‘s/he is well upset’. But it also appears that essence predicates are simply more fully
grammaticalized as tightly bound units, more strongly resisting intrusive formatives. 

We conclude that although the distribution of event modifiers exhibits a number of id- 
iosyncrasies, essence predicates resemble verb + subject combinations more closely than 
they resemble compound predicates as regards their interaction with event modifiers. 
Thus, this evidence militates in favor of the possessed-subject hypothesis and against 
the compound predicate hypothesis.

Table 20: Intervention of the event modifiers sqwe? ‘good’, ye42 ‘very’ and qa24 ‘very’
into essence predicates in SJQ Chatino

Gloss   Essence predicate   sqwe? ‘good’   ye42 ‘very’, qa24 ‘very’

‘s/he remembered’ nkqan* — rig? d Ÿ 
‘s/he was smart, fast, agile’ sat — rig’ M J 
‘s/he was happy’ ndon*? — rig? d i 
‘s/he remembered’ sqwi^ — rig? d * 
‘s/he was upset’ senq“ — riq? i M 
‘s/he pitied’ qna? — rig? B "M 
's/he was sad' xkug*? — rig? si : 
‘s/he worried’ ndwe* — riq? * S 
‘s/he was fed up’ skwa? — riq? # # 
‘s/he hated’ tqi* — riq? # # 
‘s/he was generous/happy' sqwe — riq’/tye* # * 
's/he was taciturn' liqa — riq! # A 
‘s/he was scared/queasy' — chin* — nga” Dei? * R 
chin’ nga” — tye? * * 
‘s/he liked’ ndya*? — riq? tye? x d 
ndya”* rig’ — tye? # 
‘s/he felt angry’ gan! — nga” tye? 7 # 
2 * * 


xqan” nga” — tye? 


4.3 Lack of compositionality 


As we have seen, essence predicates tend to refer to psychological states, with some exceptions.
In a large proportion of cases, essence predicates are not transparently compositional.
There are, to be sure, those whose semantics is directly deducible from their parts;
examples are the essence predicates in Table 21. But a substantial number of essence 
predicates exhibit various degrees of departure from compositionality; the examples in 
Table 22 illustrate. The analogy of essence predicates to lexically reflexive verbs (noted 
in section 1) is again apt, since reflexive predicates are often idiosyncratic in their se- 
mantics; compare attendre ‘wait for’ to s'attendre (à) ‘expect’, douter ‘doubt’ to se douter 
‘suspect’, rendre ‘return’ to se rendre (à) ‘go to’. In the case of essence predicates whose 
predicative base is a cranberry morpheme appearing in no context other than the essence 
predicate itself (see again Table 19), there is no real question of compositionality. Here, 
too, the analogy to lexically reflexive verbs holds, since they also may be based on cran- 
berry morphemes, as in the case of French s’évanouir ‘faint’ (whose verbal base évanouir 
has no independent use).

Table 21: Semantically transparent essence predicates in SJQ Chatino

Gloss   Essence predicate   Glosses of component parts

‘s/he was hard-working’   t(j)nya? riq?   [work essence]
‘s/he was open, extrovert’   la! riq?   [open essence]
‘s/he realized’   ngwi riq?   [awake essence]
‘s/he got used to’   nt(y)qan! riq?   [used.to essence]
‘s/he was hungry’   ntenq? riq?   [hungry essence]
‘s/he was feeling lazy’   ntja! riq?   [lazy essence]
‘s/he was stupid’   ntu riq?   [stupid essence]
‘s/he misperceived’   skeq' riq?   [imagine essence]
‘s/he was strong/sturdy’   sqye"* riq?   [strong essence]
‘s/he was skinny’   ti riq’   [skinny essence]
‘s/he was sturdy’   tjoq* riq?   [sturdy essence]
‘s/he was greedy’   tkonq! riq?   [greedy essence]
‘s/he was cold’   tlyaq' riq?   [cold essence]
‘s/he was tired’   tnyaq' riq?   [tired essence]
‘s/he was slow’   tyaq' riq?   [slow essence]
‘s/he was skinny’   tyjyan? riq?   [skinny essence]
‘s/he was hot’   tykeq" riq?   [hot essence]
‘s/he was shy’   yqu” riq’   [embarrassed essence]


These facts about the semantics of essence predicates might be seen as favoring the 
compound predicate hypothesis; the observed variability in semantic transparency is, of 
course, typical of compounds. But the semantic noncompositionality of many essence 
predicates might be reconciled with the possessed-subject hypothesis by regarding them 
as idioms; even the incidence of essence predicates with cranberry morphemes might be 
likened to the fact that idioms sometimes involve words that have no use outside the 
idiom (e.g. jiffy in the idiom in a jiffy, dint in by dint of, fro in to and fro). Neverthe- 
less, recurring commonalities of form and content among essence predicates might be 
argued to make them different from idioms, which tend not to possess this high degree 
of systematicity. 


4.4 Distributional flexibility of subject-agreement marking 


An important feature of Chatino subject-agreement marking is its flexibility: in the in- 
flection of a simplex verb, subject-agreement marking is expressed cumulatively with 
aspect/mood marking (as in the case of sqi? ‘s/he bought’—Table 8); but in the inflection 
of a compound predicate, aspect/mood is marked on the first member, and subject agree- 
ment is marked separately, on the second member (as in the case of yku* jyaq? ’s/he 
tasted’—Table 12). This flexibility extends even farther: If a simplex verb is followed by 




Table 22: Semantically opaque essence predicates in SJQ Chatino

Gloss   Essence predicate   Glosses of component parts

‘s/he was mischievous, playful’   jnya” riq?   [borrow essence]
‘s/he broke a bad habit/learned a lesson’   ksa* riq?   [break essence]
‘s/he worried’   ndwe* riq?   [minced essence]
‘s/he was fair-skinned’   lwi? riq?   [clean essence]
‘s/he was standoffish’   lyaq* riq?   [quiet essence]
‘s/he mocked’   lyeq? riq   [fun essence]
‘s/he was satisfied/satiated’   ndla? riq?   [fast essence]
‘s/he was happy’   ndon”? riq?   [stand essence]
‘s/he remembered’   nkqan* riq?   [sit essence]
‘s/he realized’   nkwa? jyaq riq?   [be.able measure essence]
‘s/he was dark-skinned’   nta'* riq?   [dark essence]
‘s/he pitied’   qna? riq   [poor essence]
‘s/he was smart/fast, agile’   sat riq’   [airy essence]
‘s/he was fed up’   skwa? riq’   [lie.elevated essence]
‘s/he took a liking to’   skwi! riq?   [round essence]
‘s/he liked sth’   snyi* riq’   [grab essence]
‘s/he was excited’   sti! riq?   [laugh essence]
‘s/he was standoffish’   zeg! riq?   [happy essence]
‘s/he was frugal or took care of sth’   tjenq? riq?   [sticky essence]
‘s/he placated’   tlaq'* riq?   [cool essence]
‘s/he was fully conscious’   tqa” riq?   [complete essence]
‘s/he was envious’   tqi^ riq?   [sick essence]
‘s/he was afraid’   xqnyi* riq?   [sad essence]
‘s/he was fed up with/tired of’   xyaq riq’   [mix essence]
‘s/he bullied’   xyuq riq   [naughty essence]


an event modifier, the event modifier may carry the verb’s subject-agreement morphol- 
ogy; thus, compare the inflection of ykwiq* ‘s/he spoke’ in Table 23 with that of ykwiq* 
ti^ ‘s/he just spoke’ [speak EVENT.MODIFIER] in Table 24.⁹

The compound predicate hypothesis entails that in the inflection of an essence pred- 
icate, the nominal component (riq?, tye? or qin*, alone or in combination) functions 
very much like the event modifier tit in the inflection of ykwiq* ti* ‘s/he just spoke’: 
not as a subject, but as an adverbial or quasi-adverbial modifier of the predicate's head; 
in either instance, the modifier's adjacency to the preceding head makes it available to 
carry the head's agreement morphology. On this view, the literal meaning of an essence 
predicate's nominal component does not combine in a compositional way with the lit- 


⁹Note that as in the inflection of the compound verb yku^ jyaq? ‘s/he tasted’ [eat amount] in Table 12, the
inflection of the verb + event modifier combination ykwiq* tit ‘s/he just spoke’ [speak EVENT.MODIFIER]
exhibits ablaut of its verbal element in the first person singular.




Hilaria Cruz & Gregory Stump 


Table 23: Paradigm of the verb ykwiq* ‘s/he spoke’ in SJQ Chatino


        COMPLETIVE   PROGRESSIVE   HABITUAL   POTENTIAL
1SG     ykwenq!   ntykwenq!   ntykwenq?°   tykwenq?°
2SG     ykwiq?   ntykwiq?   ntykwiq?   tykwiq*?
3SG     ykwiq'   ntykwiq?   ntykwiq*   tykwiq*
1INCL   ykwenq^ en?   ntykwenq! en?   ntykwenq" en?   tykwenq"* en?
1EXCL   ykwiq* wa“?   ntykwiq®? wa“?   ntykwiq* wa“?   tykwiq* wa“?
2PL     ykwiq* wan!   ntykwiq?? want   ntykwiq* want   tykwiq* wan*
3PL     ykwiq' renq*   ntykwiq* renqt   ntykwiq* renqt   tykwiq* renq)


Table 24: Paradigm of ykwiq® tif ‘s/he just spoke’ [speak EVENT.MODIFIER] in SJQ Chatino

        COMPLETIVE   PROGRESSIVE   HABITUAL   POTENTIAL
1SG     ykwenq! ten?*   ntykwenq! ten?*   ntykwenq?? ten?*   tykwenq?? ten?*
2SG     ykwiq? ti?   ntykwiq?? tif?   ntykwiq* tit?   tykwiq* ti?
3SG     ykwiq* ti   ntykwiq?? tit   ntykwiq* tit   tykwiq* tit


24 ,,32 


EE tykwenq^^ ten?* en 


32 32 


ntykwiq?? ten?^ en? ntykwenq?^ fer) en 
ntykwiq?? tit wa 


ntykwiq?? tit wan 


1INCL ykwiqg* ten?* en 
IEXCL  ykwig® tit wa ntykwiq* tit wa“? tykwig* tit wa“? 
2PL ykwiq* tif wan ntykwiq^ want tykwiq* want 
3PL ykwiq* tif rengt ` ntykwiq?? tit rengt ` ntykwiq* renq* tykwiq* rengt 


4 4 


eral meaning of the predicative base; instead, the nominal component has been gram- 
maticalized with a meaning something like that of English inside in experiencer-based 
expressions such as ntykwer? riq? ‘s/he got angry inside’; note again that reflexive pronouns
have been grammaticalized with much the same function in expressions such as
elle s'est fâchée ‘she got angry inside’. Thus, the compound predicate hypothesis situates
the expression of subject agreement in essence predicates within a larger, independently
motivated system in which other compound predicates and verb + event modifier
combinations also participate in parallel fashion. The distributional flexibility of subject
agreement therefore yields equivocal results. Both the possessed-subject hypothesis
and the compound predicate hypothesis relate the person/number marker on an essence 
predicate's nominal component to an independent phenomenon in Chatino: according 
to the possessed-subject hypothesis, the person/number marking on an essence predi- 
cate's nominal component can be identified with a noun's inflection for the person and 
number of an inalienable possessor; by contrast, the compound predicate hypothesis 
entails that an essence predicate's nominal component reflects a more general pattern 
in which the person and number of a predicate's subject are marked on a nonsubject 
constituent—on the second member of a compound predicate, on an event modifier, or 
on a quasi-adverbial essence word. Given that both of these patterns of person/number
marking must in any event be countenanced in an adequate grammar of Chatino, it is
not clear that the present criterion provides compelling evidence for choosing either of 
the two hypotheses over the other. 


5 Essence predicates: A formal interpretation 


Superficially, the properties of essence predicates seem ambiguous in their implications 
for a formal analysis. The essence predicate in (41) on the one hand resembles the verb- 
subject construction in (42): in both cases, the predicative word (in boldface) is inflected 
for aspect/mood and the nominal element (in italics) is inflected for person and number. 
At the same time, the essence predicate in (41) resembles the compound verb in (43): here, 
too, the boldface predicative word is inflected for aspect/mood and the nominal element 
is inflected for person and number. Finally, the essence predicate in (41) resembles the 
verb + event modifier combination in (44), where the predicative word is again inflected 
for aspect/mood and the event modifier, for person and number. 


(41) Ndi‘ riq’ not kyqyu! kwa’.
thirsty.CPL essence.3SG one male that
‘That guy was thirsty.’

(42) Nkya” sti! Xwa? kwa?.
go.baseward.CPL father.3SG Juan that
‘Juan’s father left.’

(43) Ykwiq' sla no qan! kwa?.
speak.CPL tiredness.3SG one female that
‘That woman dreamt.’

(44) Ykwiq* tif no kyqyu! kwa?.
speak.CPL EV.MOD.3SG one male that
‘That guy just spoke.’


According to the possessed-subject hypothesis, an essence predicate is a predicate- 
subject construction comparable to that of (42): its nominal element (riq? ‘essence’ in
(41)) is a subject, and as in (42), the inflectional marking on the subject expresses the 
person and number of an inalienable possessor; this entails that no* kyqyu! kwa? ‘that 
guy' is not the subject of (41), but instead denotes an inalienable possessor, like Xwa? 
‘Juan’ in (42). 

According to the compound predicate hypothesis, an essence predicate is a compound 
predicate comparable to those of (43) and (44). In a compound predicate, the second el- 
ement is not a subject, but is either a complement or a modifier of the predicate (as 
in (43) and (44) respectively), so that its inflection encodes the person and number of 
the predicate's subject rather than that of an inalienable possessor. This suggests that 
through grammaticalization, an essence predicate's nominal component has come to
serve a quasi-adverbial function, ordinarily causing the predicate to refer to the
psychological or physical state of its subject’s referent.

In section 4, we examined four characteristics of essence predicates: their structural
variety, their external syntax relative to event modifiers, their general lack of semantic 
compositionality, and their possible relation to the distributional flexibility of Chatino 
subject-agreement marking. As we have seen, these four criteria do not decisively favor 
either of the two hypotheses under consideration. The criterion of external syntax seems 
to favor the possessed-subject hypothesis; the criteria of structural variety and lack of 
compositionality seem to favor the compound predicate hypothesis; and the criterion of 
the distributional flexibility of subject agreement marking does not clearly favor either 
hypothesis. 

It is clear from this impasse that a third hypothesis is necessary to account for the 
properties of essence predicates. We therefore suggest the following account. 


* We regard an essence predicate as a lexeme whose predicative base and nominal 
component act as separate constituents in syntax.¹⁰ They are different from compound
predicates in Chatino: their parts may be interrupted by event modifiers,
while those of a compound predicate in general cannot. 


* We propose that every Chatino predicate has an “inflectional domain" to which 
its agreement morphology is confined. A predicate is ordinarily its own domain; 
this is true whether the predicate is simplex or compound. In the former case, 
aspect/mood and agreement are expressed cumulatively. In the latter case, inflec- 
tion is regulated by a principle of distributed exponence which we here equate 
with the Compound Inflection Criterion; according to this principle, a compound 
predicate’s inflection is ordinarily bipartite, with aspect/mood marked on its head 
and agreement marked on its nonhead component. (The details of this principle are
complicated by deviations from this ordinarily bipartite pattern, as e.g. in the first- 
person singular forms in Tables 12 and 24; see Cruz & Woodbury (2013) for discus- 
sion of the range of such deviations.) 


Certain kinds of syntactic combinations also constitute inflectional domains. If a 
simplex verb is modified by an adjacent event modifier, these two words compose 
an inflectional domain, whose inflection again involves the distributed exponence 


10 There is abundant evidence that lexemes may inflect periphrastically; for discussion, see Börjars et al. (1997),
Sadler & Spencer (2001), Ackerman & Stump (2004), Ackerman et al. (2011), Chumakina & Corbett (2013), 
Bonami & Samvelian (2009), and Bonami (2015). In many languages, a lexeme’s paradigm may include 
both synthetic and periphrastic realizations; that is, periphrasis is used for the realization of particular 
morphosyntactic property sets (as in Latin, where periphrastic realizations occupy the perfective passive 
cells in paradigms whose other cells are realized synthetically). An essence predicate, however, is uniformly 
periphrastic in its realization; that is, the incidence of periphrasis is not restricted to the realization of 
particular morphosyntactic property sets, but is characteristic of all of an essence predicate’s realizations. 
This view of essence predicates as lexemes whose realization is invariably periphrastic recalls the similar 
conception of Persian complex predicates proposed by Bonami & Samvelian (2010).

prescribed by the Compound Inflection Criterion.¹¹ In addition, an essence predicate
is a lexeme whose periphrastic realization functions as an inflectional domain,
exhibiting the same pattern of distributed exponence. In particular, its person/number
marking is situated on its nominal component and is an expression
of subject agreement rather than inalienable possession. 


* An essence predicate's nominal component is not a subject, but has been grammaticalized
as a quasi-adverbial formative ordinarily serving to express the psychological
or physical state of the referent of the essence predicate’s subject.


* The structural variety of essence predicates and their semantic idiosyncrasy reflect
their status as lexemes listed in the lexicon. 


* In most instances, the parts of an essence predicate are recognizably associated 
with independent lexemes, but this is not invariably the case. In English, the deriva- 
tional suffix -ize may transparently relate a verb with a causative or inchoative 
meaning to a nominal or adjectival stem (magnet → magnetize, popular → popularize)
but may also simply mark a causative or inchoative verb that is not synchronically
related to any nominal or adjectival base (baptize, ostracize, recognize).
Analogously, a Chatino essence predicate denoting a psychological or physical 
state may be transparently related (in form if not in content) to an independent 
predicate (as in (45)) but there are also "intrinsic" essence predicates that are synchronically
unrelated to any independent predicate (as in (46)). The observed parallelism
of reflexive verbs is again telling: demander ‘ask’ → se demander ‘wonder’,
but se moquer ‘mock’ (*moquer).


(45) skwa? riq? (← skwa ‘s/he lay elevated’)
‘s/he was fed up’

(46) ndi? riq? (*ndi? without riq?)
‘s/he was thirsty’
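
The division of labour proposed in the account above (aspect/mood on the head of an inflectional domain, agreement on its nonhead component, and cumulative marking when the domain is a simplex predicate) can be made concrete with a minimal sketch. The class and function names below are hypothetical and not part of the analysis itself; the labels are English glosses rather than Chatino forms, and the sketch deliberately ignores the deviations from the bipartite pattern noted for first-person singular forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InflectionalDomain:
    """Hypothetical record for one inflectional domain (gloss labels only)."""
    head: str                      # predicative base or verb, e.g. "thirsty"
    nonhead: Optional[str] = None  # second compound member, event modifier, or essence word


def gloss(domain: InflectionalDomain, aspect_mood: str, agreement: str) -> str:
    """Return a gloss line showing where each property set is marked."""
    if domain.nonhead is None:
        # Simplex predicate: aspect/mood and agreement are expressed cumulatively.
        return f"{domain.head}.{aspect_mood}.{agreement}"
    # Distributed exponence: aspect/mood on the head, agreement on the nonhead.
    return f"{domain.head}.{aspect_mood} {domain.nonhead}.{agreement}"


# Essence predicate, compound predicate, verb + event modifier, simplex verb.
print(gloss(InflectionalDomain("thirsty", "essence"), "CPL", "3SG"))
print(gloss(InflectionalDomain("speak", "tiredness"), "CPL", "3SG"))
print(gloss(InflectionalDomain("speak", "EV.MOD"), "CPL", "3SG"))
print(gloss(InflectionalDomain("speak"), "CPL", "3SG"))
```

Under these assumptions the first three calls reproduce the gloss lines of (41), (43) and (44), which is the sense in which the three constructions share one pattern of exponence.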


Other Oto-Manguean languages possess essence predicates exhibiting both similari- 
ties to and differences from those of SJQ Chatino; future work on these similarities and 
differences will likely shed additional light on the properties of this distinctive class of 
predicates.

¹¹There are also cases in which the combination of a compound predicate with an adjacent event modifier
constitutes an inflectional domain in which subject agreement is marked both on the compound predicate's
non-head component and on the event modifier; sentence (35) is an example of this sort.

Chapter 10


Why traces of the feminine survive 
where they do, in Oslo and Istria: How 
to circumvent some “troubles with 
lexemes" 

Hans-Olav Enger 


The paper examines a surprising parallel in the development of the feminine gender in Oslo 
Norwegian on the one hand and Istro-Romanian (spoken in Croatia) on the other. In both 
cases, the feminine gender is lost on all ‘normal’ gender markers, but a trace of the feminine 
remains on the definite suffix, which is the ‘last redoubt’ of the feminine gender. An attempt 
is made to link this development to a slightly modified version of the Agreement Hierarchy. 
It is suggested that the Hierarchy may be linked to grammaticalisation, and that we should 
not draw too strict lines between different kinds of agreement. 


1 The main point 


The starting-point for what follows is a parallel between Norwegian as spoken in Oslo, 
Norway, and Istro-Romanian, as spoken on the Istrian peninsula in Croatia. In both cases, 
feminine agreement is reduced, diachronically, and in both cases, traces of the feminine 
remain longer in one specific place, namely word-internally, than elsewhere. Why would 
there be such a parallel? I suggest an account which involves a modified version of Cor- 
bett's (1979, 2006) Agreement Hierarchy. In brief, the ‘definite article’, when it is a suffix, 
has a different status than other elements that signal gender. Furthermore,
an examination of the hierarchy reveals that it may be 'anchored' in the workings of 
diachrony and psycholinguistics. 


Hans-Olav Enger. Why traces of the feminine survive where they do, in Oslo 
and Istria: How to circumvent some "troubles with lexemes". In Olivier Bonami, 
Gilles Boyé, Georgette Dal, Hélène Giraudo & Fiammetta Namer (eds.), The lexeme

2 The empirical background


2.1 Oslo 


In the Oslo dialect of Norwegian, a change has taken place. A century ago, this dialect 
had three genders (in the singular, like German).¹ Compare (1):


(1) Three genders in Oslo dialect ca. 1900; examples in Norwegian Bokmål orthography.

a. en liten gutt, en fin gutt, denne gutten, ikke noen gutt
a.M little.M boy, a.M fine.MF boy, this.MF boy.DEF.SG.{M}, not any.M boy

b. en liten stol, en fin stol, denne stolen, ikke noen stol
a.M little.M chair, a.M fine.MF chair, this.MF chair.DEF.SG.{M}, not any.M chair

c. ei lita jente, ei fin jente, denne jenta, ikke noa jente
a.F little.F girl, a.F fine.MF girl, this.MF girl.DEF.SG.{F}, not any.F girl

d. ei lita jakke, ei fin jakke, denne jakka, ikke noa jakke
a.F little.F jacket, a.F fine.MF jacket, this.MF jacket.DEF.SG.{F}, not any.F jacket

e. et lite barn, et fint barn, dette barnet, ikke noe barn
a.N little.N child, a.N fine.N child, this.N child.DEF.SG.{N}, not any.N child

f. et lite hus, et fint hus, dette huset, ikke noe hus
a.N small.N house, a.N fine.N house, this.N house.DEF.SG.{N}, not any.N house


There is clear evidence for three genders, masculine (1a,1b), feminine (1c,1d) and neuter 
(1e,1f). The formal differentiation between the masculine and the feminine is not so 
clearly marked as that of both of them in opposition to the neuter. The masculine- 
feminine distinction is not realised on all associated words, but it is realised on some 
very central determiners and a few highly frequent adjectives, such as the adjective liten 
‘small’, which is overdifferentiated, showing ‘too many’ contrasts (cf. Corbett 2007). By
contrast, the adjective fin ‘fine’ is ‘regular’, showing only the opposition neuter vs. non-neuter,
in the same way as the proximal determiner denne.² In such cases, I have assigned
the value ‘mf’. 

The status of the suffix in the definite singular of nouns is intriguing (see e.g. Enger & 
Corbett 2012 and Section 3.2.3 below). Genders are defined as classes of nouns reflected 
in the behaviour of associated words (Corbett 1991). Suffixes do not count as 'associated 
words’; and yet, in the nouns in (1), the suffixes are in a strict 1:1 relation with the gender 
exponents. If a noun takes -a in the definite singular (e.g. jente ‘girl’), it will invariably 
also take ei ‘a.F’, lita ‘little.F’, noa ‘any.F’ and other ‘associated words’ expected from a
feminine: if it takes -en in the definite singular, it will also take en ‘a.M’, liten ‘small.M’,


'The following draws on Larsen (1907) and Lødrup (2011) in particular; but cf. also Enger (2004a,c) and 
Opsahl (2009). 

"There are also adjectives in which the gender distinction does not show at all, e.g. rosa 'pink', gammaldags 
‘old-fashioned’. 
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noen 'any.M', as expected from a masculine. This is the background for the use of curly 
brackets in (1). 

In Oslo these days, there is no longer any evidence from ‘associated words’ in favour 
of a separate feminine gender. In other words, the feminine agreement has been ousted 
by the old masculine. The old suffix -a, by contrast, is retained. The system, at least for 
most of the speakers, is as described in (2): 


(2) Two genders in recent Oslo dialect (compare example 1; again, examples given in
Bokmål)


a. en liten gutt, en fin gutt, denne gutten, ikke noen gutt
a.M small.M boy, a.M fine.M boy, this.M boy.DEF.SG.{M}, not any.M boy


b. en liten stol, en fin stol, denne stolen, ikke noen stol
a.M small.M chair, a.M fine.M chair, this.M chair.DEF.SG.{M}, not any.M chair


c. en liten jente, en fin jente, denne jenta, ikke noen jente
a.M small.M girl, a.M fine.M girl, this.M girl.DEF.SG.{?}, not any.M girl

d. en liten jakke, en fin jakke, denne jakka, ikke noen jakke
a.M small.M jacket, a.M fine.M jacket, this.M jacket.DEF.SG.{?}, not any.M jacket

e. et lite barn, et fint barn, dette barnet, ikke noe barn
a.N small.N child, a.N fine.N child, this.N child.DEF.SG.{N}, not any.N child

f. et lite hus, et fint hus, dette huset, ikke noe hus
a.N small.N house, a.N fine.N house, this.N house.DEF.SG.{N}, not any.N house


The usual interpretation of the data in (2), as indicated by the glossing, is that the
old feminine is no longer a separate gender in the Oslo dialect, ‘merely’ an inflection
class (Lødrup 2011, cf. also Enger 2004a,c and many others).³ The definite singular suffix
-a might seem ‘the last redoubt’ of the old feminine, cf. (2c-d), and some would like to
analyse it as a gender marker (cf. Section 3.2.3 below); that is the reason for using ‘{?}’.

A development from gender to inflection class is far from unique; such developments 
have been referred to as grammaticalisation (cf. Lehmann 1982, 2016, Wurzel 1986). The 
old feminine is changing into an inflection class also in some other Norwegian dialects, 
such as Tromsø (Westergaard & Rodina 2015, 2016), and it is absent also in some contact 
varieties in the North (Conzett et al. 2011). Essentially the same development is found in 
the Jämtland dialect in Sweden (Van Epps & Carling 2017).⁴,⁵


³There is considerable discussion about whether to take pronouns into consideration for the purposes of
gender agreement. At this stage, they are left out, for expository reasons (but cf. Section 4.2 below).

⁴On the whole, it is pointless to debate whether dialects in Scandinavia are dialects of one or the other
language, since Scandinavia generally counts as one dialect continuum. The point of interest is the parallel
between Jämtland and Oslo.

⁵A next step after the system shown in (2) is that also the old -a suffix is lost. In that way, old masculines
and old feminines become indistinguishable. This is found with some Oslo speakers, who will say en liten
jakke, jakken, just like en liten gutt, gutten. (Essentially the same system is found in “standard” Swedish and
Danish.)
2.2 Istro-Romanian


We now turn to Istro-Romanian, which is “spoken in some localities in north-eastern Istria
(Croatia) to the south of Mt Učka, and in the town of Žejane to its north. Its speakers
probably descend from pastoral communities originally resident in Bosnia, Serbia, and 
Croatia in the late Middle Ages, who settled in Istria from about the fifteenth century. The 
language’s place of origin, and whether it originally broke away from varieties spoken 
in the Romanian lands, or from those spoken in the Balkans, or represents dialect mix- 
ing, remain controversial. There are today perhaps 200-250 speakers in Croatia, mainly 
elderly and all bilingual in Croatian” (Maiden 2016b: 91). 

The number of genders in Istro-Romanian might be disputed. The system used to be 
essentially the same as that of Romanian, and the number of genders in Romanian has 
been much disputed (cf. Corbett 1991, Maiden 2016a,d, Loporcaro 2016). Besides the mas- 
culine and the feminine, which are uncontroversial, there is also, at least according to 
Corbett (1991) and Loporcaro (2016), a third gender. This gender has been referred to as 
‘neuter’ and as ‘genus alternans’. This gender has practically no morphology of its own, 
as Table 1 shows. 


Table 1: Romanian gender.


Singular                              Plural
trandafir frumos                      trandafiri frumoşi       (beautiful rose, M)
casa frumoasă                         case frumoase            (beautiful house, F)
palton frumos                         paltoane frumoase        (beautiful coat, N)

Singular with definite article        Plural with definite article
pom - pomul (tree - the tree, M)      pomii (the trees)
cutie - cutia (box - the box, F)      cutiile (the boxes)
loc - locul (place - the place, N)    locurile (the places)


The ‘neuter’ patterns with the masculine in the singular, with the feminine in the plu- 
ral. Thus, it alternates between the two, hence the label genus alternans. In Table 1, some 
endings have been boldfaced so as to show this. According to Martin Maiden (personal 
communication, and 2016c), in Istro-Romanian, while the masculine and the feminine 
happily persist, 


The plural endings which originally selected feminine gender (alternating with 
masculine singulars) have lost the alternating gender and the relevant nouns have 
become masculine in singular and plural alike, except that they may continue to 
have a distinctively feminine definite article (suffixed, as in Norwegian) ... this 
could indicate that the definite article is in a rather different category from other 
agreeing elements, at least when it is enclitic to the noun (Martin Maiden, e-mail). 
The different status of the ‘definite article’, when it is ‘inside’ the noun (word-internal),
is indeed a central theme of this paper. 


2.3 Clitic or suffix? 


It is necessary to address the status of the ‘definite article’, in both Istro-Romanian and 
Norwegian. Traditional wisdom has it that the Romanian ‘definite article’ is a clitic, but 
Ledgeway (2016a,b) has argued that it is not a syntactic ‘head’ at all, but rather a piece 
of inflectional morphology, expressing definiteness. Apparently, the Romanian definite 
article shows many of the characteristics of inflection, such as fusion, obligatoriness, 
defectiveness and erratic allomorphy. This conclusion carries over to Istro-Romanian. 

The Norwegian ‘definite article’ has traditionally been analysed as a suffix, but some
would analyse it as a clitic (e.g. Lahiri et al. 2005). However, Lødrup (2016) presents
good arguments for the traditional suffix analysis (cf. also Faarlund 2009): There are
unexpected ‘gaps’ in the definite singular inflection. Some nouns do not have to
take a definiteness suffix, even when they quite clearly occur in the definite, and these
nouns do not form a natural class. Consider first (3):


(3)  Gutten er i byen og sjekker kneet
Boy.DEF.SG.{M} is in town.DEF.SG.{M} and checks knee.DEF.SG.{N}
‘The boy is in town getting his knee checked’


A corresponding sentence without the definiteness suffixes, as in (4), would be strange:


(4) * Gutt er i by og sjekker kne


Intriguingly, if the words for ‘boy’, ‘town’ and ‘knee’ are replaced with the words for
‘dean [of a faculty at a university]’, ‘city centre’ and ‘larynx’, grammaticality judgments
would be the opposite, as (5a,b) show:


(5) a. Dekanus er i sentrum og sjekker larynks
Dean is in centre checking larynx
‘The dean is in the [city] centre getting his larynx checked’


b. * Dekanusen er i sentrumet og sjekker larynksen
Dean.DEF.SG.{M} is in centre.DEF.SG.{N} checking larynx.DEF.SG.{M}


Thus, there are ‘gaps’ in the marking of definiteness, and that does not square with 
clitic status. Some (mainly learned) nouns denoting (mainly) people and body parts do 
not take the definite article - but these nouns do not make up a natural class, as Lødrup 
(2016) shows. In other words, not all learned nouns behave like dekanus, sentrum, larynks,
and not all nouns that can behave like dekanus are learned, Latinate nouns. Compare (6): 


(6) a. Dekanus har foreslått at ...
‘Dean has suggested that ...’

b. * Diakon/ Diakonen har foreslått
‘Deacon has suggested’

c. * Leder/ Lederen har foreslått
‘Chief has suggested’

d. Avdelingsleder/Avdelingslederen har foreslått
‘Head of section has suggested’


The noun diakon ‘deacon’ is a clear loan, but it behaves like gutt ‘boy’ and not like
dekanus ‘dean’, cf. (6b). Conversely, there is nothing Latinate about the word avdelingsleder
‘head of section’, which still can behave like dekanus, cf. (6d) (and contrasts intriguingly
with the simplex leder, cf. (6c)).

One might add other arguments for taking the ‘article’ as a suffix, including the ob- 
servation that the ‘definite article’ is restricted to one word-class, and that it cannot be 
skipped on co-ordinated nouns, cf. (7a), thus differing from the ‘possessive’ -s, usually 
considered a clitic, cf. (7b): 


(7) a. gutten og faren - not *gutt og faren
‘the boy and the father’

b. fars og mors - far og mors
‘father’s and mother’s’


Also, at least for some Oslo speakers, the stem vowel of the one noun ‘mother’, mor, is
changed from the indefinite /mu:r/ to the definite /mura/, and that is unexpected under
a clitic analysis, whereas inflectional suffixes can induce irregularity.⁶


2.4 Parallels in support 


The diachronic parallel between Oslo and Istria is interesting. In both cases, a ‘word-internal’
element is where traces of the feminine stay on the longest. In Oslo, -a lingers
on as a suffix long after agreeing words such as lita ‘little.F’, noa ‘some.F’ and even ei ‘a.F’
have been lost. In Istria, the suffix is the last relic of the old genus alternans. The parallel
is close enough to warrant further examination, and the reason is probably structural; 
contact can safely be ruled out. Some other innovations in Scandinavian may be noted 
in support. 


2.4.1 Danish 


For a couple of centuries, Standard Danish has had a two-gender system, with an opposition
between masculine (or common gender, a merger of the former feminine and
masculine) and neuter (cf. Section 2 and Footnote 5). Historically speaking, the Danish
system has influenced the Oslo development, although the change in Oslo is probably
not due to contact only (Enger 2004c).


⁶Some readers may wonder if the change in stem vowel quantity for ‘mother’ might be some kind of
compensatory lengthening, which might be analysed as phonologically rather than morphologically triggered.
This seems unlikely, as the example is isolated.

In current Danish, the mass nouns vodka ‘vodka’, cement ‘cement’ are usually masculine
(as are their cognates in Norwegian). However, alongside the expected masculine
determiner den, as in den vodka ‘the.M vodka’, den cement ‘the.M concrete’, Danish also
allows for det vodka ‘the.N vodka’, det cement ‘the.N concrete’ with neuter agreement on
the attributive determiner. These nouns thus allow for alternative agreement patterns;
they have become hybrids, in Corbett's (1991, 2006) terminology. The neuter agreement 
in det vodka, det cement has been called semantic agreement (Hansen & Heltoft 2011: 232, 
Enger 2013). 

On this point, Danish goes further than its Scandinavian sister languages/dialects (cf. 
also Josefsson 2014b). Danish, Norwegian and Swedish allow ‘pancake sentences’, in 
which there is neuter agreement on the predicative adjective, even if the subject appears 
to have another feature. Consider example (8): 


(8) Vodka (det) er godt
Vodka(M) (it.NEUT) is good.NEUT.SG


At least according to one analysis (e.g. Enger 2004b, Wechsler 2013, Haugen & Enger
forthcoming), pancake sentences can be considered semantic (or ‘referential’) agreement.⁸

The same nouns, e.g. vodka, sement (Norwegian spelling)/ cement (Swedish and Danish
spelling) can take a neuter pronoun in Swedish, Norwegian and Danish, and they can
take a predicative adjective in the neuter, as in (8). However, Swedish and Norwegian do
not allow *det vodka; in other words, they do not allow semantic agreement inside the NP
in such examples. Danish allows det vodka ‘that.NEUT vodka’, det cement ‘that.NEUT concrete’
with semantic agreement, but even in Danish, only cementen ‘concrete.DEF.SG.{M}’,
vodkaen ‘vodka.DEF.SG.{M}’ with the suffix associated with the masculine is accepted. In
other words, also in Danish, *cementet, *vodkaet are ruled out; the possibility of semantic
agreement (neuter) found on the attributive determiner has not (yet?) spread to the suffix.
Thus, the suffix is again more resistant against diachronic change than other, more
word-like elements.

At this stage, a caveat is in order. I have used the terms ‘pronoun’ and ‘determiner’, 
but words that can be used pronominally in Norwegian can typically also be used as 
determiners, compare, for example the two uses of det in (9): 


(9) a. Hva synes du om det huset?
What think.PRS you of that.NEUT house.DEF.SG.{NEUT}?
‘What do you think of that house?’

b. Det er fint
It.NEUT be.PRS fine.NEUT
‘It is fine’


⁷The terms ‘hybrid noun’ and ‘semantic agreement’ and ‘referential agreement’ have been debated (cf. Dahl
1999, Corbett 2006), but for present purposes, we may set this aside.

⁸For further discussion of pancake sentences, see e.g. Corbett & Fedden (2016), Enger (2013), Josefsson (2009,
2014a), Haugen & Enger (2014).


Thus, it is far from obvious that there is a categorical split between pronouns and 
determiners (Kristoffersen 2000, Halmøy 2016: 162-3 et passim, see also Hansen & Heltoft 
2011: 183 for Danish), and in this paper, the terms ‘pronoun’ and ‘determiner’ refer to use 
only. 


2.4.2 A peripheral change in (some) Norwegian Bokmål


Norwegian Bokmål presents many examples of a slightly different, but related kind (see
also Enger & Corbett 2012, Enger 2015). Here, a new semantically motivated feminine
gender agreement is found, formerly not available, as in the examples in (10a, 10b) (from
the web):


(10) a. Ei god venn som alltid er der
a.F good friend who always is there.
‘a good friend who is always there’


b. B. har fått ei lærer som ... og hun ...
B. has got a.F teacher who ... and she ...
‘B. has got a teacher who ... and she ...’


The nouns venn ‘friend’, lærer ‘teacher’ are masculines in traditional three-gender systems,
so one would expect the determiner en. Since the masculine is ousting the feminine
in many dialects (cf. Section 2 above), one would not expect the opposite to happen as
well; it is strange to see the feminine ei spread. So a natural reaction may be to dismiss
examples such as (10a, 10b) as wrong.

However, data like these do occur, if not terribly frequently (even in the speech of
some, although I have only anecdotal evidence on this point), and the examples are not
random. They relate to nouns denoting humans, and whenever the feminine is employed,
it refers to females. The data therefore deserve to be taken seriously, and their immediate
interest is that while the article/determiner can be changed, from en venn to ei venn, from
en lærer to ei lærer, the suffix is not changed accordingly. The same two authors that produced
ei venn and ei lærer write vennen ‘friend.DEF.SG.{M}’, læreren ‘teacher.DEF.SG.{M}’
(and not *venna, *lærera) respectively, even if reference clearly is made to a woman. (See
further Section 4.1 below.)

So even if these nouns change the attributive determiner en to ei, they do not change 
the suffix -en to -a. Again, the suffix is more resistant towards change than the other 
elements, which, unlike the suffix, are independent words. 
3 Suggested analysis


3.1 The original Agreement Hierarchy 


The similarities surveyed in Section 2 are probably not accidental, and one way ahead is
to relate them to the Agreement Hierarchy (Corbett 1979, 2006). This hierarchy involves
four ‘pegs’ for four different kinds of agreement targets, as shown in Figure 1.


Attributive > Predicative > Relative > Personal Pronoun


Figure 1: The Agreement Hierarchy. 


Corbett (2006: 207) says that for "any controller that permits alternative agreements, 
as we move rightwards along the Agreement Hierarchy, the likelihood of agreement 
with greater semantic justification will increase monotonically". In other words: The 
possibility for semantic agreement will increase towards the right; if possible on the 
predicative, it will be possible on the personal pronoun too, but not necessarily the other 
way around. A case in point is the agreement patterns noted for some Scandinavian mass 
nouns (Section 2.4). Given that Danish allows semantic agreement on the attributive de- 
terminer (det vodka), semantic agreement is expected also on the predicative. In standard 
Swedish, semantic agreement is possible on the predicative; so, semantic agreement is 
expected also on personal pronouns, but it is no problem that semantic agreement is 
outlawed on the determiner. 
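
The implicational reading in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: Corbett's constraint concerns likelihood, whereas the sketch treats semantic agreement as simply possible or impossible at each position; the function name `is_consistent` and the dictionaries are introduced here and are not part of the original analysis. The relative position is left out of the sample patterns because the text gives no data for it.

```python
# Positions of the Agreement Hierarchy, ordered left to right (Figure 1).
HIERARCHY = ["attributive", "predicative", "relative", "personal pronoun"]


def is_consistent(semantic_ok):
    """Check the categorical (simplified) reading of the hierarchy:
    once semantic agreement is possible at some position, it must be
    possible at every position further to the right.

    Positions missing from `semantic_ok` count as unknown and are skipped.
    """
    seen_possible = False
    for position in HIERARCHY:
        if position not in semantic_ok:
            continue
        if semantic_ok[position]:
            seen_possible = True
        elif seen_possible:
            # Semantic agreement was possible further left but not here.
            return False
    return True


# Patterns for mass nouns such as vodka, as described in Section 2.4.1.
danish = {"attributive": True, "predicative": True, "personal pronoun": True}
swedish = {"attributive": False, "predicative": True, "personal pronoun": True}
# Hypothetical pattern, not reported in the text, that the hierarchy excludes.
excluded = {"attributive": True, "predicative": False, "personal pronoun": True}

for name, pattern in [("Danish", danish), ("Swedish", swedish), ("excluded", excluded)]:
    print(name, is_consistent(pattern))
```

Under these assumptions, the Danish and Swedish patterns both pass the check, while a language allowing semantic agreement on the attributive determiner but not on the predicative would fail it.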

While Corbett's hierarchy was originally formulated as a synchronic constraint, it 
"can easily be adapted to the diachronic perspective, predicting gender exponents to 
begin and/or complete the transition from lexical [syntactic] to referential [semantic] 
assignment the earlier, the further they are located on the right of the implicational 
hierarchy", as noted by Dolberg (2014: 55). 


3.2 The revised Agreement Hierarchy 
3.2.1 Suggestion and background 


The suggestion now is to modify the hierarchy, at least for some purposes, by expanding 
it with an additional position or ‘peg’, which is ‘word-internal’, cf. Figure 2. 


‘Word-Internal’ > Attributive > Predicative > Relative > Personal Pronoun


Figure 2: Modified Agreement Hierarchy.


The idea is that the Agreement Hierarchy has to do with ‘tightness’ of grammatical
relations, and thus with grammaticalisation, and that grammatical relations generally
are tighter inside the word than inside the phrase, tighter inside the phrase than
outside it, and weaker still across clauses. The idea that the Agreement Hierarchy may
have to do with grammaticalisation is far from original (cf. Lehmann 1982, 2016), but it
has not received quite the attention it merits (though see Jobin 2004).

When suggesting the hierarchy, Corbett (1979: 217) noted that it did not match then- 
current syntactic frameworks too well, and suggested that it was an "independent feature 
of natural languages". Nearly forty years later, this suggestion seems less appealing. As 
Dolberg (2014: 58) notes, from a diachronic perspective, Corbett's Agreement Hierarchy 
“is to be credited with being of remarkable predictive accuracy, yet it does not yield 
much in the way of explanatory power: even though it reliably tells us what to expect 
to happen in the exponents of changing gender systems, it provides little information 
regarding why this is so”.

It would help if the Agreement Hierarchy could be grounded in something else. In recent
years, many linguists have come to see constraints “not so much as constraints on possible
synchronic grammars [than, HOE] as constraints on diachronic developments” (Timberlake
2003: 194, cf. also e.g. Evans & Levinson 2009). On such a view, at least some of
the explanatory burden is shifted from synchrony towards diachrony.

According to Lehmann (1982, 2015 and elsewhere), there is a unidirectional movement
from semantic agreement towards syntactic agreement, but not vice versa. In other
words, what starts out as semantic agreement may become ‘syntacticised’ and less meaningful;
changes in the other direction should not occur. Becoming somehow ‘semantically
reduced’ is a standard criterion for grammaticalisation; another is becoming more
obligatory. Both criteria would seem to hold for ‘syntactic’ agreement compared to semantic;
Wechsler (2009) even prefers the term ‘grammatical’ agreement. This fits with
the broad picture of grammaticalisation; it is largely unidirectional. On the assumption
that diachronic tendencies motivate the Agreement Hierarchy, the hierarchy can be related
to a larger framework, viz. that of grammaticalisation.


3.2.2 Objection I: motivating the fifth peg 


The fifth peg may seem like cheating, for two reasons. Firstly, ‘word-internal (or
noun-internal) agreement’ is a controversial notion.⁹ The other ‘pegs’ are syntactic heads; the
suffix in Norwegian is morphology (cf. Section 2.3), and the idea of ‘morphology-free 
syntax’ is well-established (Zwicky 1992, Corbett 2014). Secondly, merely positing a fifth 
peg does not automatically solve the problem; the new peg does require some kind of 
motivation. As the Agreement Hierarchy has already been linked to grammaticalisation 
(Section 3.2.1), the latter problem will be discussed first. 

Different versions of the Agreement Hierarchy are in circulation. Köpcke et al. (2010)
try to make their version less system-internal and more functional. In the words of Dolberg
(2014: 18), they “assign pragmatic functions to the syntactic categories identified by
Corbett, resulting in this altered agreement hierarchy: specifying - modifying - predicating
- referent-tracking”. Dolberg (2014: 58) argues that it makes sense to consider this
version of the hierarchy together with Corbett’s original:


⁹While Stolz (2007) argues at length in favour of the notion of word-internal agreement, the point I am
trying to make here is orthogonal to his.
[M]otivating this expected pathway of referential agreement encroaching into (predominantly)
lexical gender systems is comparably straightforward in the functional
version of the Agreement Hierarchy [Köpcke et al. 2010], simply by taking
recourse to the basic surmise that changes will occur generally first in those areas,
in which the change is most conducive and/or least detrimental to language use.
Thus, the underlying assumption of the functional version of the Agreement Hierarchy
is that personal pronouns changing to referential gender yield the largest
gain in freeing cognitive capacity, as their lexical gender needs no longer be remembered
over comparably long stretches of discourse, because the appropriate
pronoun form is now simply being derived from attributes of the referent, or, more
precisely, the interlocutor's mental representation thereof, which needs to be kept
in working memory anyway. This putative gain then gradually diminishes the further
one moves to the left in the Hierarchy. (Dolberg 2014: 58)


Relating the Agreement Hierarchy to grammaticalisation (cf. Section 3.2.1) means relating
it to the ‘tightness’ of grammatical relations; one of Lehmann's (2015: 131) ‘parameters’
of grammaticalisation is bondedness or ‘tightness’: “The cohesion of a sign with
other signs in a syntagm will be called its bondedness; this is the degree to which it
depends on, or attaches to, such other signs”. Lehmann (2015: 157) says the syntagmatic
cohesion or bondedness of a sign “is the intimacy with which it is connected with another
sign to which it bears a syntagmatic relation”.

The relation between a noun and an attributive adjective is tighter, more "intimate", 
than that between a noun and a predicative adjective, which is in turn tighter than that 
between a noun and a pronoun. Elements in attributive position are inside the noun 
phrase, and the syntax of the phrase is, as a rule, tighter than that of the clause and sen- 
tence. The relation between a pronoun and its antecedent is typically ‘loose’, compared 
with that of determiner to noun, hence, semantic agreement is more characteristic of 
pronouns. A related ‘parameter’ for Lehmann (2015: 131) is that of syntagmatic variabil- 
ity; the possibility of ‘shifting around’ a sign in its construction. This also fits with the 
Agreement Hierarchy, and the relation between noun and suffix is tighter than any of 
the relations in Corbett’s original hierarchy. The suffix has to occur immediately to the 
right of the noun stem; nothing else can intervene. 

This fits with the suggestions made by Köpcke et al. (2010) and Dolberg (2014). Pronouns
are unlikely to be ‘stored’ in the mental lexicon together with their controlling
noun, and this opens the way for semantic agreement. By contrast, it seems likely that suffixes
are stored with their controller, as some idioms show. Two set phrases in Norwegian are
få sparken ‘get the sack, be fired’ and gi sparken ‘sack, fire’. The verbs få and gi mean
‘get, receive’ and ‘give’ respectively, and they are both very general and frequent, but
the noun sparken only rarely occurs outside these two idioms; it is difficult to ascribe
a meaning to sparken in isolation. There is no indefinite singular; there are no plurals.
Even if the suffix indicates a masculine noun, there is no noun phrase *en spark.¹⁰ If the
whole få sparken were stored, that would weaken the case for saying that only stem and
suffix are stored together, but sparken can marginally be found on its own, cf. examples
from the web in (11):


(11) Examples of sparken without få:


a. Facebook betyr ikke sparken ‘Facebook does not [have to] mean the sack’


b. dermed ble det sparken ‘lit. thereby became it sack; so I was sacked’


¹⁰Strictly speaking, there is a noun en spark ‘kicksled, spark’, but it is a homonym, synchronically.


Similar examples include snurten, which it hardly makes sense to translate in isolation; 
it is mostly known from the idiom se ikke snurten av ‘not see anything/the least bit of’. 
This noun does occur marginally in some other contexts, though, even without negation, 
cf. (12), again, examples are taken from the web: 


(12) Examples of snurten without ikke (and without av): 


a. aldri sett snurten av ‘never seen anything of’ 
b. uten å se snurten til ‘without seeing anything of’


c. ... kan man skimte snurten av peisen ‘... can one spot a little of the fireplace’


Scandinavian diachrony presents at least one example where the definite singular suffix
has become part of the stem. This is the noun meaning ‘world’. Swedish has värld,
Danish has verden (cf. def. sg. världen vs. verdenen). The Danish cognate is an innovation;
the old def. sg. suffix has become part of the stem. Pragmatically, this makes sense;
for most speakers, there is only one world (at least most of the time). Istro-Romanian 
also presents examples where the plural ‘definite article’ has become lexicalised (Maiden 
2016c). It is difficult to think of an example where the pronoun would merge with the 
stem in the same way, also because pronouns do not typically occur next to a noun (as 
they occur ‘instead of a noun’). 

It is more difficult to come up with examples in which the determiner must be stored
than where the suffix must, but there are some. The phrase ikke det spett means ‘not
the least’, and one might expect the noun spett to inflect as a regular neuter would. Yet
at least in my Norwegian, there is no definite singular form, nor any plurals. For spett,
then, it seems the determiner is stored with the noun.¹¹ An obvious question is if ikke
‘not’ also has to be stored, but aldri sett det spett ‘never seen no nothing’ shows it does
not have to.

It probably does not happen often that the pronoun is stored together with the noun;
this probably happens more often with the determiner. It seems even more likely that
suffixes are stored with the corresponding noun (also because suffixes are ‘salient’, cf.
Section 3.2.3 below).¹²

In Section 3.2.1, we considered an argument in favour of seeing the Agreement Hierarchy
in terms of grammaticalisation having to do with ‘semantic reduction’. According to
Heine (2003: 583), semantic reduction is the central factor behind grammaticalisation. It
is helpful to think of semantic reduction in terms of reduction of uncertainty (entropy).
The less surprising X is, the less is its information value. Consider now the examples in
(13):


(13) Pronoun and determiner in use


a. Bilen står framfor huset. Den er faktisk rosa.
Car.DEF.SG.{M} is (lit. stands) in front of house.DEF.SG.{N}. It.M is actually pink.
‘The car is in front of the house. It - i.e. the car - is actually pink’


b. Bilen står framfor huset. Det er faktisk rosa.
Car.DEF.SG.{M} is (lit. stands) in front of house.DEF.SG.{N}. It.N is actually pink.
‘The car is in front of the house. It - i.e. the house - is actually pink’


c. Den bilen som står framfor huset, er faktisk rosa.
The.M car.DEF.SG.{M} that is (lit. stands) in front of house.DEF.SG.{N} is actually pink


¹¹Admittedly, dictionaries also mention et spett. But that is unknown to many speakers, and dictionaries tend
to strive for completeness, sometimes at the expense of actual usage.

¹²The suggestion that determiner or affix may be stored together with the noun does not exclude the idea
that generalisations may be made over the gender or inflection class of a noun (cf. e.g. Conzett 2006).

Recall from Section 2.4.1 that Norwegian pronouns can typically also be used as determiners.
In (13a, 13b), den contrasts with det. In (13c), den does not contrast with det, since
*det bilen is ungrammatical. In other words, the first den tells us the speaker is talking
about the car; the last den merely tells us that a masculine or feminine will follow (and
that it is a definite, specific example). Thus, the information value of den is higher when
used pronominally than when used determinatively. Another argument in the same direction
would be that the first (personal pronoun) den can be stressed, but the last (determiner)
den cannot. This indicates that in general, the attributive determiner has a lower
information value than the personal pronouns. The suffix has an even lower information
value than the determiner (cf. Dahl 2015: 123). (Recall that the suffix is also even more
‘bonded’, which is one of Lehmann’s (2015: 131) parameters for grammaticalisation.)
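
The notion of information value used here can be made concrete with the standard measure of surprisal, the base-2 logarithm of the inverse probability. The following Python sketch is purely illustrative: the probabilities are hypothetical, chosen only to mirror the contrast between (13a, 13b) and (13c), and the function name `surprisal_bits` is introduced here.

```python
import math


def surprisal_bits(probability):
    """Information value (surprisal) of an outcome, in bits: log2(1 / p)."""
    return math.log2(1 / probability)


# Hypothetical probabilities, for illustration only.
# Pronominal use, as in (13a, 13b): den and det are both possible
# continuations; they are assumed here to be equally likely.
p_den_as_pronoun = 0.5
# Determiner use, as in (13c): given the noun bilen, only den is
# grammatical, so its probability is assumed to be 1.
p_den_as_determiner = 1.0

pronoun_bits = surprisal_bits(p_den_as_pronoun)
determiner_bits = surprisal_bits(p_den_as_determiner)

print("den as pronoun:", pronoun_bits, "bits")
print("den as determiner:", determiner_bits, "bits")
```

On these assumed figures, the pronoun carries one bit and the determiner none; the actual values would depend on real usage frequencies, which the text does not provide.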


3.2.3 Objection II: Agreement between parts of words? 


Patching suffixes on to the Agreement Hierarchy may seem a bad idea on theoretical 
grounds; this might at first glance seem tantamount to denying the claim that syntax is 
morphology-free (Zwicky 1992, Corbett 2014: 38f). This is a large issue which cannot be 
discussed in detail here, but the lexeme, the line between syntax and morphology, has 
not been handed down on tablets of stone; there are ‘troubles with lexemes’, as argued 
by Fradin & Kerleroux (2003), Haspelmath (2011) and many others. A very influential 
adherent of lexeme-based models, Matthews (1991: 100), even says “it is often the mark
of a genuine unit, like the lexeme, that we have trouble with it”.¹³

There has been some debate over whether the Norwegian definite singular suffix
should be taken as a marker of gender or of inflection class (cf. Section 2.1), and this also relates
to the problem of delimiting morphology from syntax. Åfarli & Lohndal (2015)
argue that the suffix -a should count as a marker of gender (and not ‘only’ of inflection
class), also in the recent Oslo system described in example (2). Åfarli & Lohndal are not
worried about violating lexicalist doctrines, and that is surely fair enough, given their
theoretical stand; yet it remains too open, in my view, what the consequences will be:
many things normally not included as ‘gender’ will then have to fall under that label
(many inflection classes, for instance). From the opposite side of the spectrum, Lødrup
(2011) squarely rejects analysing -a as a gender marker, as it is not an ‘associated word’.
An in-between course is suggested by Enger (2004a), who discusses a system like that
in example (1):


If genders are defined only on the basis of word-external agreement, it seems du- 
bious to treat the definite singular suffix as an exponent of gender. However, one 
may wonder if there is any reason for speakers not to consider the definite singular 
suffix a gender marker, given that the correlation with gender is perfect. In other 
words, it seems perverse to deny that the definite singular suffix is an exponent of 
gender, when there is one and only one definite singular suffix associated with 
each gender [emphasis added here]. [...] even if what determines gender contrasts 
is what patterns show up on the target (and not on the controller), affix contrasts 
that show up on the controller and that correspond to gender contrasts on targets 
have to be considered markers of gender as well. (Enger 2004a: 65) 


This means taking the definite sg. suffix as an exponent of gender in the classical 
Oslo dialect (1), but not in the present-day one (2), since the suffix did correlate with 
gender then, but does not do so now. A possible defence of taking some suffixes into 
consideration is that agreement evidence is less salient; considering agreement evidence 
requires more subtle reasoning (cf. also Carstairs-McCarthy 1994: 766).¹⁴ There is
interesting psycholinguistic evidence that Norwegian children acquire the suffixes for the
definite singular much earlier than the gender in agreeing words (e.g. Westergaard &
Rodina 2015, 2016).

However, once the Agreement Hierarchy is seen as a product of other factors, it may 
become a bit less pressing whether, say, in an example such as gutten min ‘boy.DEF.SG{M}
my.M’, the relation between gutt ‘boy’ and min ‘my’ and that between gutt and -en should 
both be subsumed under ‘agreement’. Corbett (e.g. 2006) has presented strong arguments 


13 Maiden (2016d) argues, on the basis of an impressive set of data taken from dialects and diachrony, that 
Romanian “nouns showing genus alternans are not a class defined by the agreement behaviour of associ- 
ated words, but a class the agreement behaviour of whose associated words is dictated by inflexional 
morphology [boldface mine, HOE]”. The implications are intriguing. Yet Maiden’s analysis has also been 
criticised (by Loporcaro 2016). Anyway, the subject of ‘morphology-free syntax’ is too large for this paper. 

14 Wurzel (1986) even suggested that, in general, exponents on the word itself should count.

in favour of including pronouns under the label of agreement: there are important
similarities between pronouns and other elements in the hierarchy, so that drawing a line
at any one specific point at the hierarchy will entail an arbitrary choice and the loss 
of worthwhile generalisations. By the same token, I suggest there are some worthwhile 
generalisations to be made by including some suffixes under the scope of the Agreement 
Hierarchy. Theories should be about opening doors, not about closing them. The only 
reason not to include these suffixes would be substantial empirical evidence showing 
that they behave very differently from the predictions of the hierarchy. 

In gutten min, both min and -en convey information about gutt. The notion of ‘intra- 
morphological meaning’ can be useful and productive here (e.g. Carstairs-McCarthy 
1994, Maiden 2005, Enger 2004a); the notion that an element of a word may ‘signal’ say, a 
particular property of the stem. In (1), -a has intra-morphological meaning, signalling the 
noun’s inflection class and its gender. This does not mean that -a is an ‘associated word’, 
only that it gives information about gender. In (2), -a also carries intra-morphological 
meaning, but now signalling inflection class only, because there is now no gender agree- 
ment related to it. 


4 The danger of drawing too sharp lines 


4.1 Automatisation 


Lehmann (1982) drew a sharp line between NP-internal and NP-external agreement. One 
of Corbett’s (2006) arguments against this is that there can be referential/semantic
agreement also inside the NP, and Danish det vodka and Norwegian ei lærer (cf. Section 2.4)
support Corbett’s view. Perhaps paradoxically, if Lehmann is right in arguing that
agreement has to do with grammaticalisation (cf. Section 3.2.1), then it is to be expected that
Corbett should be right in not drawing a sharp line. Grammaticalisation tends to be a 
gradual affair; I see no reason why it should come to a complete halt exactly at the NP. 
As noted, a development from (feminine) gender to inflection class may be described 
as grammaticalisation (cf. Section 2). Grammaticalisation may in turn be related to au- 
tomatisation, according to Lehmann (2016). He sees inflectional classes as more ‘au- 
tomatised' than genders, and he says one almost has to be a linguist to wilfully produce 
the wrong allophone of a phoneme or to choose the wrong inflectional suffix. Pronom- 
inal gender is at the other end of the spectrum. It is for pronouns that there is most 
‘leeway’. They are the least 'automatised'. This perspective fits the one adopted here. 
However, under certain circumstances, even inflection class suffixes can be manipulated
consciously, and not only by linguists. When looking for examples like ei lærer
(Section 2.4.2, Enger 2015), I found (in a net forum for ‘nurse jokes’) ei søt sykepleier ‘a.F


Thanks to Florian Dolberg for pointing this out to me. 

There are many suggestions in the literature that are similar to that of Lehmann. Boye & Harder (2012) re- 
late grammaticalisation to ‘backgrounding’; automatisation and backgrounding are related. Bybee (2003) 
relates grammaticalisation to ‘chunking’; her explanation of this concept makes it quite clear that automa- 
tion is relevant here too. Haiman (1994) links grammaticalisation to ritualization and repetition. Lehmann 
(2016) does not address the relation between his suggestion and these others.

cute.MF nurse’. Now, in Norwegian Bokmål, en søt sykepleier ‘a.M cute.MF nurse’, with
masculine determiner en, is the only conventional choice. In writing ei søt sykepleier, the 
author emphasises that the nurse is a woman. Another author on the same net forum 
reacted to the wording in an interesting way. Rather than criticise the choice of ei di- 
rectly, he lists a part of the paradigm, the way it is taught to school-children, and then 
comments (my translation and editing) in (14): 


(14) ei sykepleier, sykepleiera? 


‘Where did you learn your Norwegian?’ 


This is an argument ad absurdum: if you say A (ei sykepleier), then B (sykepleiera) 
follows, and given that B (sykepleiera) is absurd, A (ei sykepleier) must be rejected. For 
present purposes, the point of interest is B: Using the old feminine suffix is apparently 
even worse than the use of feminine determiner. In short, even if the suffix is extremely 
automatised, it can be manipulated and changed. 


4.2 Pronouns 
4.2.1 A problem for the present approach? 


Lehmann (1982, 2016) is not the only linguist who has wished to draw a sharp line be- 
tween NP-internal agreement and pronominal agreement. So far, pronouns have been 
kept out of the picture, but they are worth including. In the Oslo dialect today, there are 
four pronouns. Consider (15). 


(15) Pronouns in the current Oslo dialect

a. gutten.M (the boy) – han ‘he’
b. jenta.{?} (the girl) – hun ‘she’
c. låven.M (the barn) / jakka.{?} (the jacket) – den ‘it.NON-NEUT’
d. barnet.N (the child) – det ‘it.NEUT’


The choice of pronoun relates to animacy. The pronouns han, hun are used with 
animates (males and females respectively), den, det with non-animates (den with non- 
neuters, det with neuters). Animacy does not generally play a role for gender agreement 
inside the NP in Scandinavian (though cf. Enger 2013: 286-289). Pronoun agreement and 
noun-phrase-internal agreement thus follow partly different rules in this system, as in 
Danish and Swedish. Therefore, some conclude that pronouns are not subject to gender 
agreement (e.g. Josefsson 2009, 2014a). An alternative view is that pronouns should be 
included under gender (e.g. Corbett 2006, Enger 2013, Dolberg 2014, Haugen & Enger 
2014, Van Epps & Carling 2017). 
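
The pronoun choices in (15) can be restated as a small decision procedure. The following Python sketch is only an illustration: the function name oslo_pronoun and its feature values are hypothetical labels, and treating barnet as a noun without a sex specification is an assumption made to fit (15d), not a claim from the literature cited here.

```python
def oslo_pronoun(sex=None, neuter=False):
    """Pick the anaphoric pronoun for a noun in the current Oslo dialect.

    sex:    "male", "female", or None when the referent is not specified
            for sex (inanimates; assumed here for barnet 'the child' too).
    neuter: True if the noun is neuter, False for non-neuter nouns.
    """
    if sex == "male":
        return "han"
    if sex == "female":
        return "hun"
    # No sex specification: fall back on the neuter / non-neuter contrast.
    return "det" if neuter else "den"


# The rows of (15), keyed by the definite noun form.
examples = {
    "gutten": oslo_pronoun(sex="male"),
    "jenta": oslo_pronoun(sex="female"),
    "låven": oslo_pronoun(),
    "jakka": oslo_pronoun(),
    "barnet": oslo_pronoun(neuter=True),
}
```

Note that the function never consults a feminine gender value: on this sketch, hun is selected by the sex of the referent alone, which is why the choice of pronoun says nothing about the status of the suffix -a.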

Once pronouns are taken into account, it may seem that the modified Agreement Hi- 
erarchy gets into trouble: It might seem as if the feminine in Oslo now is retained in 
the very extremes of the hierarchy, viz. the pronominal peg and the suffix peg, and not
in-between. On closer inspection, however, this is not so. As noted, the Agreement
Hierarchy predicts that a new gender system, if semantically based, will start from the right
end of the hierarchy and the old system will stay on the longest at the very left end. The
word hun in (15) indicates a human, or a higher animal, of female sex. That is not
the intra-morphological meaning of -a (cf. Section 3.2.3). While the intra-morphological
meaning of -a can be roughly given as ‘the stem to my left belongs to a particular
inflection class, including words such as jakke ‘jacket’ and many others’, the meaning of hun is
roughly ‘the noun to my left denotes a person of female sex’.


4.2.2 A problem for another approach 


In their Swedish grammar, Holmes & Hinchliffe (2013: 4) say that “Nouns ending in -a
[in the indefinite sg., thus ending in -an in the definite sg., HOE] which denote animals
are often treated as feminine irrespective of their true gender [i.e. biological sex, HOE]:
råttan – hon the rat – she, åsnan – hon the donkey – she”.

This observation is interesting, as it represents a problem for an important approach 
to Scandinavian gender. According to Josefsson (2009: 40, 2014a), lexical gender, which
is found within the DP, does not carry any meaning. By contrast, gender is a meaningful 
category in the pronominal domain. Thus, Josefsson's approach implies a sharp bound- 
ary between pronominal agreement, which is meaningful, and DP-internal agreement, 
which is not. However, if we wish to explain why Swedish råttan ‘the rat’ and åsnan ‘the 
donkey’ are more often referred to with hon than, say, musen ‘the mouse’ and hästen 
‘the horse’, we are stuck with the fact that the former end in -a in the indefinite singular
[råtta, åsna], the latter do not [mus, häst]. Yet ‘ending in an -a in the indefinite singular’
is hardly a meaningful property. (See Haugen & Enger forthcoming, for a summary of 
other arguments against Josefsson's approach, and further references.) 


5 Conclusions 


I have pointed out a parallel between Oslo Norwegian and Istro-Romanian. In both cases,
the ‘last redoubt’ of the old feminine is a suffix on the noun. The parallel is not coinci- 
dental; there are other Scandinavian examples (cf. Section 2.4) indicating that the noun's 
suffix is more ‘resistant’ towards change than are ‘associated words’. The difference can 
relate to a somewhat modified version of the Agreement Hierarchy (Corbett 1979, 2006, 
Köpcke et al. 2010), in which an extra ‘peg’ is added for the suffix. This modification is in
line with the spirit of Fradin & Kerleroux (2003); they also note ‘troubles with lexemes’, 
but they do not use those problems as arguments against the lexeme as such. Rather 
than getting stuck in such problems, we may, for example, utilise the handy concept of 
intra-morphological meaning (Section 3.2.3). Following Lehmann (1982), I have argued 
that relating the Agreement Hierarchy to grammaticalisation may be useful, at least for 
some purposes. 


The example also illustrates ‘semantic reduction’, cf. Section 3.2.2.

Chapter 11


The Haitian Creole copula and types of predication: A Word-and-Pattern account


Alain Kihm 
CNRS, Université Paris-Diderot 


Haitian Creole is a French-based creole language spoken by about 10 million people in
Haiti. In Haitian Creole the copula has two overt forms, se and ye, and it may also remain
unexpressed. The present paper argues that, despite claims to the contrary, the Haitian Creole
copula is a verbal lexeme realized through two overt suppletive stems and a phonologically 
null stem. Selecting one stem or the other does not depend on inherent and/or contextual 
inflectional features as in English am vs. is vs. was vs. were, but on the syntax and semantics 
of the predicate headed by the copula lexeme. 


1 Introduction 


In Haitian Creole (HC), a French-based creole spoken by about 10 million people in
Haiti, the copula is expressed via two overt forms, se and ye, and it may also remain
unexpressed. Various studies, most of them couched in syntactic transformational terms, have
been devoted to this variation (Valdman 1978, Damoiseau 1985, DeGraff 1992, Kihm 1993, 
Déprez & Vinet 1997, Déprez 2003). The main debate centred around the issue of whether 
the two overt forms are verbs (e.g. Valdman 1978, Kihm 1993) or pronouns (DeGraff 1992) 
or both (Déprez 2003). 

Here I will try to support the four following assumptions: (i) the Haitian Creole copula 
is a verb throughout; (ii) the two overt forms are word forms in the sense of Matthews 
(1972), realizing alternative suppletive stems of the copular lexeme; (iii) the lexeme also 
includes a null stem, devoid of phonological substance; (iv) selecting one stem or the 
other (including the null stem) does not depend on inherent and/or contextual inflec- 
tional features as is often the case (cf. English am vs. is vs. was vs. were, go vs. went), but 
on the syntax and semantics of the predicate headed by a given form of the lexeme. 

The Haitian Creole stem alternation thus differs not only from the English instances 
just mentioned, but also from cases where the phonological shape of an item merely
depends on the syntactic environment, i.e. on what the item appears next to. Zwicky
(1985, 1990) gives several examples, such as the French singular possessive determiners
which take on the masculine form when preceding a feminine item beginning with a
vowel: e.g. mon ombrelle ‘my sunshade’, not *ma ombrelle (cf. une ombrelle ‘a sunshade’).
Yet, as argued by Zwicky (1985), it wouldn’t make sense to assume that the gender feature
common to both components of the NP [mon ombrelle] is not the same as in e.g. ma
maison ‘my house’. What is in fact needed to account for such an apparent mismatch is
a rule of referral stipulating that the shape, but not the content, of feminine singular
possessive determiners is identical to that of masculine singular possessive determiners
just in the case that the adjacent word begins with a vowel. (For rules of referral also see
Stump 2001: 36-37.) Note also that the adjacent word need not be the head noun: cf. mon
ancienne maison ‘my old house’.
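
The effect of such a rule of referral can be made concrete with a short Python sketch. It is an illustration only: the function name possessive_shape and the feature labels are hypothetical, and the test for a vowel-initial word is a crude orthographic stand-in for the phonological condition (it ignores, for instance, words spelled with initial h).

```python
# Orthographic vowels used as a rough proxy for "begins with a vowel".
VOWELS = set("aeiouàâäéèêëîïôöûüœ")

MASC_SG = {"1sg": "mon", "2sg": "ton", "3sg": "son"}
FEM_SG = {"1sg": "ma", "2sg": "ta", "3sg": "sa"}


def possessive_shape(person, gender, next_word):
    """Return the shape of a French singular possessive determiner.

    gender is the content feature shared with the head noun ("m" or "f");
    next_word is the adjacent word, which need not be the head noun.
    """
    if gender == "f" and next_word[0].lower() in VOWELS:
        # Rule of referral: the content stays feminine, but the shape
        # is borrowed from the masculine singular determiner.
        return MASC_SG[person]
    return (MASC_SG if gender == "m" else FEM_SG)[person]
```

The gender argument is never altered by the function: only the returned shape changes, which mirrors the claim that mon ombrelle and ma maison share the same gender feature.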

In Haitian Creole, in contrast, whether se, ye or nothing audible is inserted depends not on
the shape of what follows, but to some extent on the lexical category of the complement
and, more importantly, on the semantics of the predication type. The
ser/estar alternation in Portuguese and Spanish may provide an analogue (Mateus et al. 
1989: 98-102), except for the fact that ser and estar are likelier to represent two distinct 
lexemes than distinct stems of the same lexeme as in Haitian Creole. In the latter, as we 
shall see, the equivalent of the ser/estar contrast is the se vs. nothing contrast. Now, it 
is not detrimental to parsimony to assume a null stem of a given lexeme, provided it 
belongs to a paradigm whose other members are all overt forms, so that the content of 
the null form can be unambiguously retrieved thanks to contrast with the overt forms’ 
contents (see Sag et al. 2003 on the copula in African-American Vernacular English). 
Lexemes devoid of phonological realization would be much harder to justify, in contrast. 
Moreover the conditions on ye’s insertion find no equivalent in the ser/estar alternation, 
while supporting the suppletive stem hypothesis. 

What I am proposing, therefore, is a fully lexicalist account which accounts for most 
of the facts and avoids the unnecessary complexities and implausible assumptions of the 
previous syntactic treatments. First I review the facts. Then I show how these facts can 
be accounted for by assuming one copular lexeme, the lexical entry of which mentions 
several stems, each of which identifies a particular lexical entry of type word, whose 
valence and semantics are subsets of the valence and semantics of the lexeme. Colloca- 
tions of these words with tense-mode-aspect (TMA) markers are realized via realization 
rules written in an Information-based Morphology (IbM) format (Crysmann & Bonami 
2015). In the conclusion, I point out what remains, to my mind, in need of an account and
I suggest some lines of research that might lead to a fuller understanding of the Haitian 
Creole copula, especially from a diachronic viewpoint. 


2 The facts of the HC copula 


Part of the Haitian Creole copula’s paradigm can be retrieved from the following
examples (Déprez 2003: 135, 136, 139; Fattier 2013: 201):


(1) Jan se yon pwofesè.
    John COP INDF teacher
    ‘John is a teacher.’

(2) Jan chapantye.
    John carpenter
    ‘John is a carpenter.’

(3) Jan malad.
    John sick
    ‘John is sick.’

(4) Jan nan lekol la.
    John in school DEF
    ‘John is at school.’

(5) Elifèt te anba tab la.
    E. PST under table DEF
    ‘Elifèt was under the table.’

(6) Se frè mwen Jan ye.
    COP brother 1SG John COP
    ‘It is my brother that John is.’


As mentioned above, three forms emerge from these examples: (i) se in (1) and (6),
obviously from French c'est /se/ ‘it is’; (ii) the null form in (2)-(5); (iii) ye in (6), from 
French est /e/ ‘is’ or i(l) est /je/ ‘he is’. 

Let us first compare (1), where the copula is realized as se, with (2) where it is not 
realized at all. The difference seems to lie in the syntactic category of the complement, 
an NP in (1) and a NOM in (2) (Sag & Wasow 1999: 84). And note that chapantyé in (2) can 
be modified by an attributive adjective: e.g. Jan bon chapantyé ‘John is a good carpenter’. 

The crucial difference, however, actually resides in the individual-level (permanent, 
identificational) character of the property predicated by means of se, in the present case 
being a professor (Carlson 1977, Diesing 1988, Chierchia 1995, Kratzer 1995). Se's comple- 
ments need not be indefinite NPs involving the indefinite determiner yon ‘a’ as in (1). 
Whenever the complement denotes some obviously permanent quality of the subject, de- 
termination can be dispensed with. See for instance the following extract from a poem 
by Bonel Auguste (Chalmers et al. 2015: 20), where being man's limit is presented as a 
defining property of man's dream: 


(7) Rèv lòm se limit lòm.
    dream man COP limit man
    ‘Man’s dream is man’s limit.’ (Le rêve de l’homme est la limite de l’homme)


Despite the absence of the definite articles one sees in the French translation, limit lòm
is a definite NP in (7) by virtue of being a genitive construction whose complement lòm is
itself definite as it refers to the maximal set of human beings (see Lyons 1999: 181-184 on
“class generics”; Huddleston & Pullum 2002: 407; Kihm 2003). Bare nouns (i.e. NOMs) are
also acceptable under the same conditions as in Mari se fanm ‘Mary is a woman’ (Glaude 
2012), alternating with the almost synonymous Mari se yon fanm. In French as well, in a 
somewhat literary register, Marie est femme is an acceptable alternative to Marie est une 
femme. 

Given this, (2) appears to be ambiguous, in the sense that being a carpenter may be 
viewed as a permanent, individual-level quality of John, or as just a stage-level descrip- 
tion of what John is at the time the sentence is uttered. Nouns denoting professions or 
trades typically trigger that kind of ambiguity, always allowing for referentially
equivalent predicates with or without se. (For similar facts in French, see Kupferman 1979,
Boone 1987.)

The individual- vs. stage-level contrast can also be made manifest in adjective predi- 
cates. Contrary to the received idea that Haitian Creole adjectives are in fact stative verbs 
that never need a copula, Damoiseau (1996) demonstrates on the basis of a corpus study 
that for more than half of the items (including malad) adjective predicates without an 
overt copula as in (3) imply a stage-level interpretation, while the same with se as in Jan 
se malad are understood as predicating an individual-level property of the subject (also 
see Pompilius 1976). This is patently shown by the distinct clefting strategies implied by 
either possibility. Clefting stage-level predications (no overt copula) is done by way of 
“doubling” as in (8) (Déprez 2003: 146): 


(8) Se damou Jan damou.
    COP in.love John in.love
    ‘John IS in love.’


Compare Se manje Jan manje {COP eat J. eat} ‘John did eat’. Clefted individual-level
predications (involving se), in contrast, are like (6). See (9) (Damoiseau 1996: 157):


(9) Se grangou li ye.
    COP unscrupulous 3SG COP
    ‘S/he IS unscrupulous.’


Interestingly grangou also has the stage-level meaning ‘hungry’, in which case clefting
employs the same strategy as for damou ‘in love’ in (8): Se grangou Jan grangou ‘John IS
hungry’.

Example (4) shows the copula is not realized when the complement is a PP. However, 
not all PP complements behave alike: PP complements, locative or not, predicating a 
potentially permanent situation require se as shown in (10) and (11) (Déprez 2003: 141- 
142): 


(10) Tout sa se pou ou.
     all this COP for 2SG
     ‘All this is for you.’

(11) M pa te di ou vi mwen se nan navigasyon.
     1SG NEG PST tell 2SG life 1SG COP in navigation
     ‘I did not tell you my life is in navigation.’


The descriptive generalization therefore seems to be that the copula is realized as se 
before a noun, adjective or prepositional phrase denoting a potentially individual-level 
property of the subject, while it has no exponent when the denoted property is poten- 
tially stage-level. I hedge this statement with "potentially" because it seems to be rare 
that being viewed as a stage or individual-level property does not to some extent depend 
on the intentionality of the speaker rather than being entirely anchored in the ontology 
of the property itself. 

In (5) one might wonder whether te is not actually the past form of the copula. Two 
considerations oppose this supposition. First, complementary data show te to be a past 
tense marker (a ‘particlexeme’ in Zwicky's 1990 terminology) that may combine with 
other undisputable TMA markers. See the following examples from Fattier (2013: 199, 
201): 


(12) Li te gen twa zoranj.
     3SG PST have three orange
     ‘S/he had three oranges.’

(13) Li t(e) ap boukanen mayi.
     3SG PST PROG roast maize
     ‘S/he was roasting maize.’


Yet, there still might exist two homophonous te, one a past marker, the other the 
copula’s past form. Actually, such an assumption would have history on its side, since 
te obviously comes from the French imperfect était ‘was’ and/or the past participle été 
‘been’ and the TMA sequence in (13) can be traced back to the obsolete and/or dialectal 
French past progressive periphrasis était après or (a) été après.

Synchronically, however, there is good reason not to regard te as the past copula,
namely that transposing (6) into the past gives us Se frè mwen Jan te ye ‘It’s my brother
that John was’, not *Se frè mwen Jan te, as we would expect if te were the past copula. I
will therefore assume that the past tense marker te in (5) “precedes” (if one may say so)
the same null form of the copula as is evidenced in (2)-(4).

Example (6) illustrates both the use of se in clefts and the copula's third form ye. Let 
us begin with the latter. Its peculiarity is to require a gap to its immediate right. The gap, 
the foot of a long distance dependency (LDD) (Sag et al. 2003), may be part of a cleft as 
in (6) or of a WH-construction as in (14) from a poem by André Fouad (Chalmers et al. 
2015: 62): 


(14) Di m kijan lavi te ye.
     tell 1SG how life PST COP
     ‘tell me how life was.’ (dis-moi comment était la vie.)


Note it wouldn’t do to simply state that ye must be followed by nothing (meaning an
utterance-final pause). Something may indeed occur after it, provided it is not a comple- 
ment, but rather dislocated material as in (15) (Tessonneau 1980: 18) or an adjunct as in 
(16) (Déprez 2003: 148): 


(15) Sa l ye nèg la ki marye avè fi a?
     what 3SG COP man DEF REL marry with girl DEF
     ‘What is he, the man who married the girl?’

(16) Nonm nan te pi gran m te ye lè sa a.
     man DEF PST more big 1SG PST COP time DEM DEF
     ‘The man was bigger than I was at that time.’


Conceivably ye’s immediate follower in (16) is a gap whose filler is gran ‘big’. Note
that ye is neutral as to the stage vs. individual-level contrast. This is expected since ye
only occurs in clauses involving LDDs, whose neutral, declarative or noncomparative
counterparts may involve either type of predication: e.g. the answer to (15) might be Nèg
la ki marye avè fi a se yon pwofesè ‘The man who married the girl is a professor’, while
a possible non-comparative counterpart of (16) would be Nonm nan gran ‘The man (is)
big’.
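
The distribution of the three stems reviewed so far can be summarised as a small decision procedure. The Python sketch below is only an illustration under simplifying assumptions: the names copula_stem and Level are hypothetical, the speaker’s construal of the property as stage-level or individual-level is simply passed in as a given, and the further restrictions on se combined with TMA markers or negation are ignored.

```python
from enum import Enum


class Level(Enum):
    """How the speaker construes the predicated property."""
    STAGE = "stage-level"
    INDIVIDUAL = "individual-level"


def copula_stem(complement_is_gap, level=None):
    """Return the stem of the copular lexeme; "" stands for the null stem.

    complement_is_gap: True when the complement is the foot of a
        long-distance dependency (cleft, wh-question, comparative).
    level: the predication type; irrelevant when the complement is a gap,
        since ye is neutral as to the stage vs. individual-level contrast.
    """
    if complement_is_gap:
        return "ye"
    if level is Level.INDIVIDUAL:
        return "se"
    return ""


# (1) Jan se yon pwofesè: individual-level, overt complement.
stem_1 = copula_stem(False, Level.INDIVIDUAL)
# (3) Jan malad: stage-level, overt complement.
stem_3 = copula_stem(False, Level.STAGE)
# (6) ... Jan ye: the complement is a gap.
stem_6 = copula_stem(True)
```

The gap condition is checked first because, as just noted, ye occurs with either type of predication; the level only matters when the complement is realised in place.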

As mentioned, the fact that initial se in (6) lacks a subject has led some authors to 
cast doubt on its verbal character (DeGraff 1992) or to define it as an “introducer” — 
whatever that may be — distinct from copular se (see discussion in Valdman 1978). Yet, 
null subjects do exist in Haitian Creole as shown by the following two examples (Déprez 
1992a: 24; Déprez 1992b: 198):


(17) Rete yon nèg nan kay la.
     remain one man in house DEF
     ‘There remains one man in the house.’

(18) Sanble Mari renmen Jan.
     seem Mary love John
     ‘It seems that Mary loves John.’


Such unrealized subjects correspond to expletive subjects in languages like English or 
French where nullity is disallowed: compare Il reste un homme dans la maison, Il semble 
que Marie aime Jean. But note that in 17th-century French sembler and rester could be
used without expletive il in sentences quite similar to (17) and (18) (Haase 1935: 15-16).
The null subject of se in (6) and in such sentences as Se vre {COP true} ‘It’s true’ (French
C’est vrai) falls under this generalization. Although se’s initial /s/ obviously originates
in the French neutral pronoun ce of c’est ‘it is’, this is highly unlikely ever to have had
any relevance in the fully emerged Creole (that is, since the end of the 18th century),
where se has become an unanalysable item, contrary to what I argued in Kihm (1993). I
therefore conclude that se is a verbal copula across the board, and it belongs to the small
set of verbs that allow expletive null subjects, a feature to be mentioned in its lexical
definition.

Se presents still other properties. First, contrary to what the examples so far may 
suggest, it is not limited to third person. See (19) from a poem by Solèy (Chalmers et
al. 2015: 22) where its subject is the clitic form m of mwen ‘I, me’, occurring with all
verbs (cf. m pati ‘I left’):


(19) M se espas nan mitan de pyebwa.
     1SG COP space in middle two tree
     ‘I am the space between two trees.’ (je suis l’espace entre deux arbres)


And see (16), which shows that ye, like se, is compatible with all person-number values. 

An intriguing property of se is its position vis-à-vis TMA markers and the negator, as 
illustrated in the three following examples (Glaude 2012: 39; Valdman 1978: 240; Cavé in 
Chalmers et al. 2015: 46): 


(20) Jan se pa te papa w.
     John COP NEG PST father 2SG
     ‘John wasn’t your father.’

(21) Sa se va yon gwo nouvèl.
     that COP FUT INDF great news
     ‘That will be great news.’

(22) Se tap yon tan pèdi.
     COP PST.PROG INDF time lose
     ‘It would be time lost.’ (Ce serait une perte de temps)


As shown by (20) the grammatical order is se < NEG < TMA, whereas it is NEG <
TMA < V with all other verbs, including ye (cf. 14). Examples (20)-(22) suggest that all
simple or complex TMA markers are admissible with se. However, not all native speakers
accept se va and se ap.¹
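
The two linearization patterns can be stated side by side in a short Python sketch. It is an illustration only: the function name linearize is hypothetical, the negator is hard-coded as pa, and the sketch does not model the fact that some speakers reject se va and se ap.

```python
def linearize(stem, neg=False, tma=()):
    """Order a verb stem with respect to the negator and TMA markers.

    stem: the verb form; "" stands for the null stem of the copula.
    neg:  True if the clause is negated (negator pa).
    tma:  sequence of TMA markers in their relative order, e.g. ["te", "ap"].
    """
    markers = (["pa"] if neg else []) + list(tma)
    if stem == "se":
        # se < NEG < TMA
        return ["se"] + markers
    # NEG < TMA < V for all other verbs, including ye;
    # the null stem contributes no form of its own.
    return markers + ([stem] if stem else [])


order_20 = linearize("se", neg=True, tma=["te"])  # cf. (20) se pa te
order_14 = linearize("ye", tma=["te"])            # cf. (14) te ye
order_12 = linearize("gen", tma=["te"])           # cf. (12) te gen
order_null = linearize("", neg=True, tma=["te"])  # cf. Ou pa te zanmi mwen
```

Only se needs a branch of its own; ye patterns with ordinary verbs such as gen, which is one more respect in which the two overt stems of the copula differ.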

Another peculiarity of se is that the possibility of its being preceded by all subject 
pronouns gets drastically reduced whenever it combines with TMA markers and/or the 
negation. The pronoun is then obligatorily 3SG, it is left-dislocated and only the emphatic
form li-mèm may be used. See the following contrast (Déprez 2003: 151):


(23) *Li se te zanmi mwen.
     3SG COP PST friend 1SG
     Intended: ‘S/he was my friend.’

(24) Li-mèm, se te zanmi mwen.
     3SG-self COP PST friend 1SG
     ‘S/he was my friend.’


1 I am grateful to Jean Noël Whig for these judgments.

The same ungrammaticality affects *Li se pa zanmi mwen contrasting with Li-mèm,
se pa zanmi mwen ‘S/he isn’t my friend’ and *Ou(-mèm) se (pa) te zanmi mwen, whose
grammatical alternative is Ou (pa) te zanmi mwen ‘You were (not) my friend’, using the
null form of the copula. In (24) the subject of se is therefore the null subject bearing 3SG
as its only possible value.

Déprez (2003: 151) relates the ungrammaticality of *Ou(-mèm) se... to that of French
*Toi, c’est/c’était mon ami next to Elle/lui, c’est/c’était mon ami(e). There certainly is truth
in this parallel. Yet it does not account for the well-formedness of Ou se zanmi mwen ‘You
are my friend’ or Jan se zanmi mwen ‘John is my friend’. In fact, it seems to be a true
generalization that se modified by TMA markers and/or the negation only selects for the
null subject, so that Jan in (20) is actually left-dislocated as is li-mèm in (24) and as is
Jean in the French equivalent Jean, c’est/c’était mon ami. As this is not so obvious
as with pronouns, it has to be checked with careful prosodic analyses.

Another noteworthy fact is the neutralization of the stage- vs. individual-level contrast
with non-third person subjects and inflected se, since Ou (pa) te malad ‘You were (not) 
sick’ is the only negative and/or past counterpart of the positive present contrasting pair 
Ou malad ‘You're sick’ and Ou se malad ‘You're a sick person’. 

Finally, it is worthwhile noting that se may be elided as s’ before yon ‘a’, yielding the
portmanteau /sɔ̃/. See the following lines by Solèy (Chalmers et al. 2015: 22):


(25) Labote / s’ on zwazo benyen an san.
     beauty COP INDF bird bath in blood
     ‘beauty / is a bird bathed in blood.’ (la beauté / est un oiseau ensanglanté)


This confirms, if need be, that se is unanalysable as a single word despite its etymology. 
As for the null form, it is compatible with all TMA markers and the negator, as shown 
by (5) as well as by (26) (Glaude 2012: 49) and (27) (DeGraff 2007: 114): 


(26) Jan ap doktè.
     John PROG doctor
     ‘John will be a doctor.’

(27) Duvalye pa prezidan Ayiti.
     Duvallier NEG president Haiti
     ‘Duvallier isn’t the president of Haiti.’


As Glaude points out, (26) cannot mean ‘John is being a doctor’, quite normally in fact: 
interpreting the progressive as a future is a general possibility, and the only one with 
stative verbs (Fattier 2013). The positive counterpart of (27) is Duvalye prezidan Ayiti 
‘Duvallier is the president of Haiti’, whereas the negative of the also acceptable Duvalye 
se prezidan Ayiti is Duvalye, se pa prezidan Ayiti (see above). 

3 A formal account of the Haitian Creole copula 


In this section I will only try to account for the clearest facts as exemplified in (1)-(6). 
What I leave aside for future research will be set out in the conclusion. 

As stated in the introduction, I assume the Haitian Creole copula to be one verbal 
lexeme realized as three stems, one null, selected according to predication type. This 
lexeme can be represented as the lexical entry below: 


(28)  copv-lexm 
      LID     cop 
      SYN     HEAD  [PRED +] 
              VAL   SPR    [1] <NP | null> 
                    COMPS  [2] <NP | NOM | PP | ADJP | ADV | gap> 
      ARG-ST  <[1], [2]> 
      FORM    STEM  [A] | [B] | [C] 
      SEM     MODE   prop 
              INDEX  s 
              RESTR  RLN   cop 
                     SIT   s 
                     SBJ   i 
                     PRED  j pred stlev | indlev 


That is to say, the Haitian Creole copula is a predicator whose valence includes (i) a 
specifier that is a possibly unrealized NP; (ii) a complement that may be an NP, a NOM, 
a PP, an adjective phrase, an adverb (e.g. Se konsa ‘It’s so’), or a gap. Recall that NOM is 
the label for noun phrases unspecified for (in)definiteness, such as chapantyé in (2). 

Let me also point out that Haitian Creole personal pronouns are best analysed as mem- 
bers of the NP category. There seems to be no good reason, in particular, to view their 
reduced forms (see Table 1) as anything but phonological clitics, since (i) reduced and 
unreduced forms alternate without change of meaning; (ii) sequences of reduced forms 
and TMA markers or verbs do not give rise to any particular phonological phenomena 
as is the case with English contracted auxiliaries (Bender & Sag 2000). For instance, 3SG 
li may but need not reduce to l when preceding a vowel-initial verb or TMA marker, e.g. 
l ap chante ~ li ap chante ‘s/he/it is singing’ (but li/*l chante ‘s/he/it sang’); similarly in 
object position following a vowel-final verb, e.g. yo wè li ~ yo wè l ‘they saw her/him/it’ 
(but yo bat li/*l ‘they struck her/him/it’). The crucial factors seem to be register and speed 
of delivery. 
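
The phonological conditioning of the reduced form of the third person singular pronoun can be summed up in a small Python sketch. It is an illustration only: the function name li_variants, the simplified vowel inventory and the treatment of the host as a plain string are assumptions of the sketch, and register and speed of delivery are not modelled.

```python
VOWELS = set("aeiouèòéó")  # assumed, simplified vowel inventory


def li_variants(position, host):
    """Return the possible shapes of the pronoun li next to `host`.

    position: "subject" (li precedes a verb or TMA marker) or
              "object"  (li follows a verb).
    host:     the adjacent verb or TMA marker, as a string.
    """
    if position == "subject":
        reducible = host[0].lower() in VOWELS    # vowel-initial host
    elif position == "object":
        reducible = host[-1].lower() in VOWELS   # vowel-final host
    else:
        raise ValueError("position must be 'subject' or 'object'")
    # Reduction is optional, so the full form is always available.
    return ["li", "l"] if reducible else ["li"]
```

On this sketch the marker ap and the verb wè license both shapes, whereas chante in subject position and bat in object position only allow the full form.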

Expressions headed by the copula are propositions about some situations and they 
are semantically restricted to predicating stage-level (stlev) or individual-level (indlev) 
properties of a given subject. This has to be specified, since it conditions the choice of 
the proper stem among the three stems that realize the copula, tagged A (the null stem), 
B (se), and C (ye) according to degrees of nondefaultness. 

Table 1: Haitian Creole personal pronouns 

     SG        PL 
1    mwen/m    nou/n 
2    ou/w      nou/n 
3    li/l      yo/y 

The syntactic environment calling for the null stem (A) is summed up in (29): 


(29) Jan  (pa)  (te)  (bon)  chapantyè / malad (anpil) / nan lekol  la  / konsa. 
     John (NEG) (PST) (good) carpenter   sick  (very)    in  school DEF   so 

     ‘John is/was (not) a (good) carpenter / (very) sick / at school / so.’ 


That is to say, the copula's null stem is required if (i) the subject is an NP; (ii) the 
complement is a NOM, or an ADJP, or a PP, or an adverb; (iii) the denoted property is 
viewed as being transitory, that is of the stage-level sort. Whatever the complement, the 
copula may be negated and/or specified for some TMA value. 

The question now is to relate the copula's stems to the syntactic and semantic prop- 
erties calling for one or the other. Since (28) describes the lexeme labelled cop, each of 
the stems may be viewed as realizing a word-form of the lexeme, each word-form with 
its own lexical entry. The A stem is thus assigned the following lexical entry, where the 
phonological form is represented as the empty list, and the valence and semantics are 
subsets of the lexeme's valence and semantics: 


(30)  verb word 
      LID     [A] cop 
      PHON    < > 
      SYN     HEAD  [PRED +] 
              VAL   SPR    [1] <NP> 
                    COMPS  [2] <NOM | PP | ADJP | ADV> 
      ARG-ST  <[1], [2]> 
      SEM     MODE   prop 
              INDEX  s 
              RESTR  RLN   cop 
                     SIT   s 
                     SBJ   i 
                     PRED  j pred stlev 

Suppose now we want to account for the predicate te bon chapantyé ‘was a good 
carpenter’ (French était bon charpentier). Following Bonami (2015), I assume Haitian Creole 
collocations such as te chante ‘sang, used to sing’ to be periphrases, that is multiword 
morphological units involving an ancillary and a main element, in which the former is 
a marker instead of a verb as in the English periphrasis has sung. (See Van Eynde 1994 
and Sag 2012 for the relevant notion of marker as a non-head element selecting a head 
and assigning it features.) The only difference between te chante and the case at hand 
is that the main verb’s stem has no phonology associated with it. Hence the following 
realization rule for the collocation of the past marker te with the null stem of the copula, 
using Information-based Morphology formalism (Crysmann & Bonami 2015): 


(31)  mword 
      PHON  <te> 
      MPH   < [1] [PH <te>, PC 1], [2] [PH < >, PC 1] > 
      MS    { [3] [TMA pst], [A] [LID cop] } 
      RR1   MUD  { [3] [TMA pst] } 
            MPH  < [1] [PH <te>, PC 1] > 
            RS   [ ] 
      RR2   MUD  { [A] [LID cop] } 
            MPH  < [2] [PH < >, PC 1] > 
            RS   [ ] 


Rule (31) realizes a multiword (mword) comprising the marker te and the null copula 
tagged A pointing to the relevant word-form and stem. Owing to this tagging we ensure 
that /te < >/ will be inserted in the right syntactic and semantic contexts. 

Note the reverse selection (RS) feature is given no value in (31). The function of this 
feature is to ensure that, in periphrases such as has sung, the main verb's form (e.g. the 
past participle) stands in the context of the ancillary item that requires it (e.g. have). 
In Haitian Creole, however, the form of the main verb never depends on the marker in 
collocation with which it assumes a given TMA value. Being a word, on the other hand, 
te includes a COMPS feature [VFORM finite] in its lexical entry. 

In the morphophonological (MPH) tier of the rule, the phonological (PH) form <te> 
and the null stem are assigned the same position class (PC) 1. This is in order to avoid 
the awkward statement that te “precedes” something that is actually not there. From 
a morphophonological viewpoint, we may therefore consider te in te bon chapantyé a 
portmanteau word amalgamating the marker and the null stem, somewhat similar to 
French du for <de le>. 

Rule (31) will also account, mutatis mutandis, for the collocations ap < > and pa < > 
of (26) and (27). 

Let us now tackle se. The syntactic environments calling for it are not so easy to sum 
up in one example. At least three are necessary, discounting for the moment the issue of 
the position of TMA markers and the negator: 


(32) Mari se  yon  (bon)  profesè / fanm  / sé     ou  / malad. 
     Mary COP INDF (good) teacher   woman   sister 2SG   sick 

     ‘Mary is a (good) teacher / a woman / your sister / a sick person.’ 

(33) Se  vre  / konsa / yon  lót   bagay. 
     COP true   so      INDF other thing 

     ‘It’s true / so / another thing.’ 

(34) Vi   mwen se  nan navigasyon. 
     life 1SG  COP in  navigation 

     ‘My life is in navigation.’ 


Se is thus shown to be required when (i) the subject is an NP as in (32) and (34) or 
is null as in (33); (ii) the complement is an NP as in (32), or a NOM whose head clearly 
denotes some permanent quality such as being a woman, or an adjective phrase denoting 
an individual-level property as in (32) and (33), or a PP with the same type of denotation 
as in (34), or an adverb such as konsa in (33). Owing to questions about its valence, I 
leave aside se in clefts such as (6), although I'm confident it can be shown to represent 
the same lexeme as se in the other contexts. The lexical entry for the se word-form of the 
copula is therefore (35): 


(35)  verb word 
      LID     [B] cop 
      PHON    <se> 
      SYN     HEAD  [PRED +] 
              VAL   SPR    [1] <NP | NOM> 
                    COMPS  [2] <NOM | PP | ADJP | ADV> 
      ARG-ST  <[1], [2]> 
      SEM     MODE   prop 
              INDEX  s 
              RESTR  RLN   cop 
                     SIT   s 
                     SBJ   i 
                     PRED  j pred indlev 

I assume the present tense reference of se in examples (32)-(34) is a corollary of its 
not being modified by any TMA marker, so that there is no question of a “zero” marker. 
Hence the following realization rule for se in, for instance, (32) with yon chapantyé as a 
complement: 


(36)  mword 
      PHON  <se> 
      MPH   < [1] [PH <se>, PC 1] > 
      MS    { [2] [TMA prs], [B] [LID cop] } 
      RR1   MUD  { [2] [TMA prs] } 
            MPH  < [1] [PH <se>, PC 1] > 
            RS   [ ] 
      RR2   MUD  { [B] [LID cop] } 
            MPH  < [1] [PH <se>, PC 1] > 
            RS   [ ] 


In accordance with the “paradigmatic” view of TMA retrieval, [TMA prs] and the 
stem's realization are assigned the same phonology and position class. 

What about the position of TMA markers and the negator as illustrated in (20)-(22)? 
Considering only the sequence (se te), one would be tempted to see it as one word sete 
meaning ‘was/were’, which would then have to count as a fourth stem of the copula or 
as an exceptionally synthetic inflection of the second stem. There are several hitches to 
that solution. First, one would have to deal with the fact that this putative word could 
be broken up by the negator pa, as one sees in (20). Infixes do exist, yet assuming pa to 
behave as an infix just in this case will certainly be felt to be too costly. The only solution 
coherent with the sete hypothesis would then be to view as one word not only it, but 
also the sequences (se pa te) ‘was/were not’ and (se pa) ‘am/is/are not’. 

It seems to me to be simpler and less offensive to Occam's razor to posit special realiza- 
tion rules such that TMA markers and the negator — a natural class as exponents of an- 
alytic inflection including polarity — exceptionally follow rather than precede the main 
verb when it is se. As usual, the explanation for such a crazy behaviour is bound to be 
diachronic to some extent: cf. French c'est pas /se pa/ ‘it isn’t’, but c'était pas /sete pa/ ‘it 
wasn't’, which confirms te's identity as a TMA marker and shows the COP < NEG < TMA 
ordering to be a Haitian Creole innovation consequent to te's emergence. 


Rule (37) accounts for the sequence (se pa te) of se pa te yon bon chapantye ‘wasn’t a 
good carpenter’: 


(37)  mword 
      PHON  <se pa te> 
      MPH   < [1] [PH <se>, PC 1], [2] [PH <pa>, PC 2], [3] [PH <te>, PC 3] > 
      MS    { [4] [POL neg], [5] [TMA pst], [B] [LID cop] } 
      RR1   MUD  { [B] [LID cop] } 
            MPH  < [1] [PH <se>, PC 1] > 
            RS   [ ] 
      RR2   MUD  { [4] [POL neg] } 
            MPH  < [2] [PH <pa>, PC 2] > 
            RS   [ ] 
      RR3   MUD  { [5] [TMA pst] } 
            MPH  < [3] [PH <te>, PC 3] > 
            RS   [ ] 

This rule should be contrasted with the rule accounting for the “normal” order /pa te V/ 
of, e.g., pa te chante ‘didn’t sing’: 


(38)  mword 
      PHON  <pa te chante> 
      MPH   < [1] [PH <pa>, PC 1], [2] [PH <te>, PC 2], [3] [PH <chante>, PC 3] > 
      MS    { [4] [POL neg], [5] [TMA pst], [C] [LID chante] } 
      RR1   MUD  { [4] [POL neg] } 
            MPH  < [1] [PH <pa>, PC 1] > 
            RS   [ ] 
      RR2   MUD  { [5] [TMA pst] } 
            MPH  < [2] [PH <te>, PC 2] > 
            RS   [ ] 
      RR3   MUD  { [C] [LID chante] } 
            MPH  < [3] [PH <chante>, PC 3] > 
            RS   [ ] 


The main difference — apart from the fact that chante, like all verbs but se and raising 
verbs (see above), does not accept null subjects — lies in the respective position classes. It 
is particularly noteworthy that the mutual ordering of the negator and the TMA marker 
is fixed: pa < TMA. It is this sequence that appears as a block on the “wrong” side when 
the verb is se. 
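
The contrast between the position classes of (37) and (38) can be made concrete with a small Python sketch. It is an illustration only, not an implementation of Information-based Morphology: the function name linearize and the representation of exponents as (phonology, position class) pairs are assumptions of the sketch.

```python
def linearize(exponents):
    """Order exponents by position class and join their phonology.

    Each exponent is a (phonology, position_class) pair; an empty
    string stands for a stem that has no phonology of its own.
    """
    ordered = sorted(exponents, key=lambda e: e[1])
    return " ".join(ph for ph, _ in ordered if ph)


# Position classes as assigned in rules (37) and (38).
SE_PA_TE = [("se", 1), ("pa", 2), ("te", 3)]
PA_TE_CHANTE = [("pa", 1), ("te", 2), ("chante", 3)]

# Rule (31): te and the null stem share position class 1.
TE_NULL = [("te", 1), ("", 1)]
```

In both multiwords the negator sits one position class before the TMA marker; the only thing that changes is the class assigned to the verb stem, 1 for se and 3 for chante.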

Examples (6) Se fré mwen Jan ye ‘It’s my brother that John is’ and (14) kijan lavi te ye 
‘how was life’ suffice to illustrate the third stem’s environment: its subject must be an 
NP and its complement a gap related to clefting as in (6) or questioning as in (14). Hence 
the following lexical entry: 

(39)  verb word 
      LID     [C] cop 
      PHON    <ye> 
      SYN     HEAD  [PRED +] 
              VAL   SPR    [1] <NP> 
                    COMPS  [2] <gap> 
      ARG-ST  <[1], [2]> 
      SEM     MODE   prop 
              INDEX  s 
              RESTR  RLN   cop 
                     SIT   s 
                     SBJ   i 
                     PRED  j pred 


As mentioned above, ye is neutral as to whether the predicated property is a stage- 
or individual-level one. Its occurrence in just one environment justifies my ranking it as 
the most non-default stem. On the other hand, the mutual ranking of the null stem and 
se in terms of defaultness may be judged moot. The numbers of triggering contexts are 
the same, and I can’t see any good reason why stage-level properties should be deemed 
more default than individual-level properties. Be that as it may, since stems must be tagged 
in any event and nothing much hangs on the relative ordering of se and the null stem, I 
maintain the ranking of (28). 
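
As a rough summary of the conditions stated for the three stems, the selection can be sketched in Python. The function select_stem and the string labels are hypothetical conveniences. The sketch assumes that full NP complements always call for se, and it deliberately ignores the neutralization of the stage- vs. individual-level contrast under TMA marking and negation.

```python
# Phonology of the three stems, keyed by their tags.
PHON = {"A": "", "B": "se", "C": "ye"}


def select_stem(complement, level=None):
    """Pick the copula stem tag: 'A' (null), 'B' (se) or 'C' (ye).

    complement: 'NP', 'NOM', 'PP', 'ADJP', 'ADV' or 'gap'
    level:      'stlev' or 'indlev'; ignored for gaps and full NPs
    """
    if complement == "gap":
        return "C"      # ye: clefting or questioning, neutral as to level
    if complement == "NP":
        return "B"      # assumption: full NP complements take se
    if complement in {"NOM", "PP", "ADJP", "ADV"}:
        if level == "stlev":
            return "A"  # null stem: stage-level property
        if level == "indlev":
            return "B"  # se: individual-level property
        raise ValueError("level must be 'stlev' or 'indlev'")
    raise ValueError(f"unknown complement category: {complement}")
```

For instance, an adjective phrase such as malad is paired with the null stem on its stage-level reading and with se on its individual-level reading, while a gap always yields ye.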


4 Conclusion: What has been done and what remains to do 


Haitian Creole facts lie precisely at the interface of morphology and syntax, and it has 
been the aim of the present article to show how a word-based morphological model is 
especially fit to do justice to such an inherently morphosyntactic character. 

Formalizing the data as I just have done is a necessary step in understanding how 
things work. It doesn’t tell us, however, why things work the way they do. 
Explanation in the real sense of the term has to come 
from outside formal grammar. In the case at hand, the likeliest source is diachrony, that 
is the sociolinguistic conditions under which Haitian Creole emerged and the nature of 
the linguistic input at the origin of this emergence. 

As to the first point, our best hypothesis is that Haitian Creole emerged between the 
1680s and the end of the 18th century as a consequence of the massive importation of 
African slaves into Haiti, officially a French possession from 1697 to 1804 (see Holm 
1989: 382-387; Faraclas et al. 2007), and that it was mainly the product of a process of 
second language acquisition (SLA) by adults in adverse conditions. The target language, 
French, could only be acquired in an unguided fashion, “on the job”, and what was 
actually acquired was not French itself but only a basic variety of it (Klein & Perdue 
1997), which later expanded into a full-fledged language. The Africans’ knowledge of 
their first languages (the substrate) played a role in this process, although apparently 
no direct one in the copula issue. 

Where it may have proved influential is in the fact that the stage- vs. individual-level 
contrast is active in what seems to have been Haitian Creole’s main substrate language, 
namely Fongbe (Lefebvre 1998). In Fongbe according to Ndayiragije (1993: 63) “only 
predicates whose argument structure includes an event position — Stage-Level Predi- 
cates... may be clefted, contrary to those that do not include that position — Individual- 
Level Predicates” (my translation). This is what makes the difference between e.g. gbà 
‘to destroy’ and sé ‘to know’. In Haitian Creole as well the same difference obtains 
between kraze ‘to destroy’ and konnen ‘to know’, so that (40) is grammatical, whereas (41), 
possibly meaning ‘John does know that language’, is not (Lefebvre 1990; see also (8)-(9)): 


(40) Se  kraze   Bouki kraze   kay   la. 
     COP destroy B.    destroy house DEF 

     ‘What Bouki did to the house was destroy it.’ 

(41) *Se  konnen Jan konnen lang     sa  a. 
      COP know   J.  know   language DEM DEF 

     Intended: ‘John does know that language.’ 


The se vs. null form contrast therefore appears to be a special case of this overarching 
contrast permeating the whole verbal lexicon, which seems to be more central in Fongbe 
than it is in French, though it is present in the latter as well. 

Concerning the French input, on the other hand, we unsurprisingly hold no recording 
of the sort of 17th century French in which the arriving slaves were addressed or 
which they could pick up from the native French speakers they were in generally 
unpleasant contact with. That it was a colonial koiné not too different from the central 
Parisian dialect, we can be reasonably sure of (Chaudenson 2004). Whether it was the 
full language or a foreigner talk reduction of it, we don't know, though there is evidence 
that the full lexifier languages were used in the Caribbean plantations where creole 
languages emerged (Alleyne 1980). 

What we can and must do then, is first try to account for the facts that have been 
pushed under the rug in the present work, in particular the strange behaviour of se 
according to whether it is or is not modified by TMA markers and/or the negation, and why 
the stage- vs. individual-level contrast is then neutralized. Secondly, we should look up 
17th century French grammar, using such resources as Haase (1935), in order to 
determine as much as possible to what extent the Haitian Creole system inherits from its 
lexifier's system. For instance, although the substrate is likely to have been influential 
as suggested above, there probably is a relation between the distribution of se and the 
null stem (requiring individual- and stage-level complements respectively) and the 
distribution of c'est and il/elle est preceding a nominal complement in 17th century as 
well as contemporary French (Kupferman 1979, Boone 1987, Zribi-Hertz to appear). All 
this, however, belongs to the to-do tray. Let’s hope it won’t linger there too long. 

Chapter 12 


On lexical entries and lexical representations 


Andrew Spencer 


University of Essex 


Lexicalist models of syntax share with lexeme-and-paradigm models of morphology the as- 
sumption that the primary unit of the lexicon is the lexeme, an abstract representation of 
properties unifying a set of inflected word forms. Lexicalist syntactic models (such as Head- 
driven Phrase Structure Grammar, henceforth HPSG, and Sign-Based Construction Gram- 
mar, henceforth SBCG) distinguish modelled linguistic objects from descriptions of objects. 
A description, but not an object, can be a partial (underspecified) representation. However, 
a lexeme is by definition only partially specified, being underspecified for all those 
morphosyntactic properties that its word forms realize (the lexeme DOG realizes neither singular 
nor plural, unlike the word forms dog, dogs). This implies that lexemes are descriptions, not 
objects, which is incompatible with assumptions about the type hierarchy for signs and the 
lexicon in HPSG/SBCG. If we relax the definition of full specification to admit lexemes as ob- 
jects then the question arises as to how many properties can be left unspecified. I argue for 
a maximally underspecified model. Even the declaration of properties for which the given 
class of lexemes inflects (the ‘morpholexical signature’, MORSIG) is underspecified to the ex- 
tent that its contents are predictable. This entails that an inflected word form of a lexeme 
can be defined only after the MORSIG attribute is specified. Derivation, a lexeme-to-lexeme 
mapping, can therefore be defined over the same maximally underspecified lexical represen- 
tations, whose inflection is then typically governed by a different morpholexical signature 
(e.g. when the derivation changes word class). All such specifications are given by default 
statements, which are overridden for irregular items. Verb-to-adjective transpositions (par- 
ticiples) are members of the verb’s paradigm yet inflect according to the adjectival paradigm 
(the ‘adjectival representation’ of a verb). This gives the effect of a ‘lexeme-within-a-lexeme’, 
posing a challenge for lexeme-and-paradigm models. I present an analysis in which the 
definition of the participle is driven by a feature REPRESENTATION. This (re-)defines the MORSIG 
attribute, creating a representation which is identical to that of an adjective, while remain- 
ing part of the verb’s paradigm. I discuss some of the implications of this analysis for lexical 
relatedness, the lexical type hierarchy of SBCG and the morphology-syntax interface. 


Andrew Spencer. On lexical entries and lexical representations. In Olivier Bonami, 
Gilles Boyé, Georgette Dal, Hélène Giraudo & Fiammetta Namer (eds.), The lexeme 
in descriptive and theoretical morphology, 277-301. Berlin: Language Science Press. 

1 Introduction 


The notion of word is by definition central to lexicalist models of syntax, so one would 
imagine that morphology, too, would occupy a central place in the construction of such 
models. However, there is as yet surprisingly little consensus between morphologists 
and syntacticians over fundamental aspects of word structure and the relations between 
words and syntax or semantics. In addition, I will argue that there is a systematic unclar- 
ity in conceptualizations of wordhood even amongst those of us who accept the pri- 
macy of the lexeme notion and its role in morphosyntax (‘lexeme-and-paradigm’ mod- 
els). One central ontological question is ‘what kind of a thing is a word?’ The problem is 
that, whereas inflected word forms can be regarded as ‘concrete’ linguistic objects which 
combine with each other to form phrases (another type of object), lexemes are by their 
nature more abstract: they are ultimately representations which unite a set of related in- 
flected word forms without themselves being a form. They are therefore underspecified 
representations, in the sense that they are not specified for the various morphosyntactic 
properties their word forms realize. The dictionary is a set of lexemes, so it, too, is an 
abstract construct. 

The question of what lexemes are is made more acute when we examine a somewhat 
neglected, but theoretically and conceptually important, type of lexical relatedness, the 
(true) transposition, illustrated in this paper by the Russian deverbal participle. A par- 
ticiple is the adjectival ‘representation’ (Haspelmath 1996) of a verb. As such, it is part 
of the paradigm of a verb and yet it inflects exactly like an adjective and demonstrates 
much of the external syntax of an adjective (a true participle is used principally as an 
attributive modifier to a noun). Shifting morphosyntactic category in this fashion is char- 
acteristic of derivation, i.e. lexeme formation, yet a true participle (that is, a participle 
that has not undergone lexicalization, or some other process of grammaticalization) is 
not an autonomous lexeme, independent of its verb base, any more than the past tense 
or the infinitive form of a verb is an autonomous lexeme. The participle thus gives the 
appearance of being a ‘lexeme-within-a-lexeme’, posing obvious difficulties for any sim- 
ple characterization of lexeme-and-paradigm inflectional morphology, and especially to 
the inferential-realizational (I-R) class of models in Stump’s (2001) typology. 

In this paper I investigate some of these questions against the backdrop of the class of 
I-R models called Paradigm Function Morphology (PFM: Stump 2001, Bonami & Stump 
2016). Specifically, I will assume the overall architecture of a model of lexical relatedness 
proposed in Spencer (2013), Generalized Paradigm Function Morphology (GPFM). I con- 
front the proposals about lexical representations and lexical relatedness made in GPFM 
with influential proposals put forward within the variant of HPSG developed by Sag 
(2012), Sign-Based Construction Grammar (SBCG). I argue that the HPSG/SBCG concep- 
tion of the lexeme conceals important conceptual inconsistencies. In particular, a lexeme 
can only be described by a feature structure (FS) that is partially specified. However, this 
means that technically a lexeme is just a description and not an object. Yet the archi- 
tecture of the HPSG lexicon demands that lexemes be bona fide linguistic objects, not 
descriptions of objects. 

If we simply declare the lexemes as objects then the question arises as to how much 
the lexeme can be underspecified. Building on the defaults-based GPFM model I argue 
that a lexeme is best regarded as a maximally underspecified object, bearing all and 
only those properties which are not predictable from default specifications.¹ I show how 
the maximally underspecified lexemic representation can help solve the question of the 
status of transpositions such as participles. 

I make a number of background assumptions. 


* A dictionary is a list of lexemes. 


* Inflectional morphology operates according to I-R principles and defines a 
paradigm for each class of lexemes, each cell of which is occupied by a pair ⟨ω, σ⟩ for 
the set σ of morphosyntactic properties realized by the word form ω. 


* A fully specified representation of a lexeme includes a specification of a set of 
syntactic properties, a semantic representation (which for convenience I take to be 
a simplified form of Lexical Conceptual Structure, Jackendoff 1990) and a unique 
identifier, variously called the Lexical Identifier (LID), the Lexical Index, or the 
Lexemic Index (LI). (This is comparable in function to the lexicographer’s lemma.) 


* The syntactic properties of a lexeme include a specification of its argument 
structure (ARG-ST). 


* The ARG-ST attribute of a lexeme includes a semantic function role (SF role, Spencer 
2013), canonically R for nouns, E for verbs and A for adjectives. 


The chapter is structured as follows. I open by outlining four possible ways of repre- 
senting lexemes, the fourth of which relies heavily on the device of defaults and over- 
rides operating over a maximally underspecified entry. The next section addresses the 
question of whether a lexeme can be regarded as an object or not, and how many of its 
properties can be underspecified. 

In §4 I turn briefly to the model of lexical representation proposed in Spencer (2013), 
and specifically to the way in which an inflectional feature declaration (MORSIG, 
‘morpholexical signature’) can be defined and deployed in a defaults-based model of lexical 
representation. Against this background §5 addresses the architecturally important 
question of the place of transpositions such as deverbal participles. These are an important 
test case because they raise questions of lexemic identity and category membership: 
the participle behaves as a ‘quasi-lexeme’, without being the output of derivational 
lexeme formation proper. I deploy an attribute REPRESENTATION to define transpositions. 
I discuss the way that the adjectival inflectional paradigm can be incorporated into the 
paradigm of a verb by appropriate use of the MORSIG attribute. I illustrate with a 
description of the Russian participial system. I contrast the behaviour of true participles 
with that of transpositional lexemes (Spencer 2013, 2016), which are derived autonomous 
lexemes formed from transpositions such as participles. 

¹ This corresponds to Sag’s (2012) notion of listeme. The listeme has a somewhat unclear status in SBCG, but 
Sag explicitly describes it as a description and not an object, so it is not a perfect correspondent to the 
conception of lexeme proposed here. 

In §6 I ask how transpositions might be incorporated into a multiple inheritance hi- 
erarchy but note two problems. First, multiple inheritance hierarchies are not straight- 
forwardly capable of distinguishing, say, the adjectival representation of a verb (par- 
ticiple) from the verbal representation of an adjective (inflecting predicative adjective). 
Second, there is in any case virtually no discussion in the morphological literature of 
transpositions and hence no consensus on how their morphological properties should 
be accounted for. I conclude with a tentative list of questions which arise from the dis- 
cussion. 

I will close this introduction with a terminological note. I shall simplify discussion 
wherever possible by assuming the correctness of my approach and taking the lexeme 
to be effectively identical to its description. That is, a lexeme is a dictionary entry, an 
abstract underspecified representation, which we can think of as a meta-representation, 
unifying the concrete representations in the complete set of its word forms. The obvious 
synonym for ‘dictionary entry’ is ‘lexical entry’. However, in constraints-based syntac- 
tic models the notion of ‘lexeme’ is rather poorly developed, and the term ‘lexical entry’ 
is often (though not invariably!) used to refer not to the abstract object listed in a dic- 
tionary but rather to a concretely instantiated inflected word form of a lexeme. This ter- 
minological ploy is confusing, but is now ingrained. Following Dalrymple et al. (2015), 
I shall therefore adopt the term ‘lexemic entry’ for the standard lexicographic notion 
of dictionary entry. I will avoid the term ‘lexical entry’ and refer to the representation 
(fully or partially specified) of an inflected form as the lexical representation of that word 
form. This is more than a question of mere terminology, especially in HPSG, but proper 
evaluation of the issues would require a separate study. 


2 The nature of the lexeme 


In principle there are a good many ways in which dictionary entries can be represented. 
It will be useful to consider four of these. The first possibility is to list every inflected 
form separately with a complete specification of all its properties, whether idiosyncratic 
or predictable. This will include (i) all the morphological properties, such as inflection 
class, (ii) syntactic properties such as argument structure, including the SF (semantic 
function) roles, valence, selection, collocation, lexicosyntactic class features and others, 
together with (iii) contextual properties or properties relating to usage such as register, 
connotations, and other, not strictly linguistic, properties that a competent user would be 
expected to know about the word (what is sometimes called ‘encyclopaedic information’, 
though this term is difficult to pin down). I shall call this mode of representation the 
unindexed full word form listing model. Some psycholinguistic models of the mental 
lexicon appear to have essentially this structure. It does not define a dictionary entry in 
any direct sense because every word form of every lexeme has the same representational 
status as any other: dog and dogs are only marginally more related to each other on this 
model than are dog and dig or dogs and geese. 

The unindexed full word form listing model effectively excludes any standard 
understanding of the notion of dictionary entry, therefore. However, it would be possible to 
reconstruct the traditional notion of dictionary entry by providing all the forms that 
unite under a given lexeme with a unique lexemic index. Thus, dog, dogs would both 
have the index DOG, distinct from that of dig (DIG) or geese (GOOSE). This would then 
define our second model of lexical representation, which I will call the indexed full word 
form listing model. The LI would have to be a secondary property associated with each 
component of a lexemic entry, FORM, SYN, SEM. At the level of FORM this would mean 
indexing the lexeme's root, its various stem forms and all its inflected forms (unless these 
were able to inherit the LI of their stems). At the level of SYN, SEM each individual 
sub-attribute (syntactic class, argument structure or whatever, depending on one's syntactic 
assumptions) would be furnished with the same LI, as would the basic meaning or lexical 
conceptual structure and any other aspects of meaning. This use of a lexemic index is 
very similar to that proposed by Jackendoff (1997) and integrated into the Simpler Syntax 
model (Culicover & Jackendoff 2005), though their model makes rather different assump- 
tions about the structure of inflected words because it retains the morphemic concept 
and therefore is not strictly speaking lexeme-based. 
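The difference between the first two models can be made concrete with a minimal sketch. It assumes a toy encoding in which each listed word form is a flat record. The SYN and SEM values are illustrative placeholders, and only the index labels DOG, DIG and GOOSE come from the discussion above.

```python
from collections import defaultdict

# Fully listed word forms. Without the "LI" key this is the unindexed model,
# in which nothing relates "dog" to "dogs". The SYN/SEM values are toy labels.
WORD_FORMS = [
    {"FORM": "dog",   "SYN": "N.sg", "SEM": "dog",   "LI": "DOG"},
    {"FORM": "dogs",  "SYN": "N.pl", "SEM": "dog",   "LI": "DOG"},
    {"FORM": "dig",   "SYN": "V",    "SEM": "dig",   "LI": "DIG"},
    {"FORM": "geese", "SYN": "N.pl", "SEM": "goose", "LI": "GOOSE"},
]


def dictionary_entries(word_forms):
    """Group listed word forms by lexemic index to recover dictionary entries."""
    entries = defaultdict(list)
    for word_form in word_forms:
        entries[word_form["LI"]].append(word_form["FORM"])
    return dict(entries)


ENTRIES = dictionary_entries(WORD_FORMS)
print(ENTRIES)
```

On this encoding the dictionary entry is not stored anywhere: it is recovered as the set of listed forms that share an LI.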

These first two models share the property that all inflected word forms are fully listed. 
In such models there is effectively no morphology defining the lexical relatedness that
holds between word forms of the same lexeme. In order to capture formal similarity/identity
between word forms it would therefore be necessary to postulate lexical redundancy
rules (Jackendoff 1975, Bochner 1993) or inflectional templates (Ackerman et al. 2009).

The third model I shall call the fully specified lexemic entry model. The term ‘fully 
specified’ refers to the fact that on this model (along with the previous two models) the 
lexemic entry includes fully predictable information about the FORM, SYN, SEM represen- 
tations as well as unpredictable, idiosyncratic information. For instance, if all syntactic 
nouns in the language are also morphological nouns (i.e. if the language lacks category 
mixing with respect to the noun class) then the property of inflecting as a noun, that 
is, being a morphological noun, can be deduced from the SYNCAT label. However, under
the fully specified lexemic entry model such a word would still be given the attribute 
[MORCAT noun] or the equivalent as part of its FORM representation. Where this third 
model differs from the previous two is in the important assumption that (regularly) in- 
flected word forms are not included as part of the lexicon as such. Rather, such a model 
follows lexicographic tradition in abstracting away from inflected word forms, defining
them instead by means of a separate ‘inflectional engine’, such as PFM. On the fully
specified lexemic entry model, the lexeme-as-dictionary-entry is accorded a special on- 
tological status, that of a linguistic object. Depending on how such a model is imple- 
mented formally it may or may not be necessary to individuate dictionary entries by 
means of the arbitrary LI attribute. However, traditional lexicography certainly makes
use of something very close to an LI in the form of a lemma or headword. An arbitrary
label of this sort appears to be the most natural way of individuating entries. 

The fourth model of lexical representation is the underspecified lexemic entry model, 
argued for in Spencer (2013). This model deploys the logic of default inheritance to
abstract away fully predictable lexical information. The lexemic representation in this case
includes just the information that cannot be inferred by default from other aspects of the
representation or from other facts in the grammar of the language. Thus, in our previous
example, if the specification [MORCAT noun] is fully predictable from the specification
[SYNCAT noun] then the MORCAT specification need not be stated in the lexemic entry
itself (indeed, there need be no mention of the attribute MORCAT at all).

To see how the underspecified lexical entry model might define dictionary entries, 
consider a word such as TREE. This minimally has to specify a phonological form for the 
basic stem form (root), STEM0 = /tri:/, as well as minimal information about the kind of
meaning the word has. As far as morphosyntax and especially inflection is concerned it 
hardly matters, of course, what kind of a thing a tree is (much less where to draw the 
line between trees and bushes). Also, the difference between abstract and concrete deno- 
tations seems to have little grammatical import, in English. However, it is important to 
know that TREE denotes some type of Thing and that it is countable, in contrast to words 
such as VEGETATION, or WOOD (in the sense of ‘material coming from a tree’). Informally, 
we can distinguish count Things and mass Things with a subscript: Thing_c/Thing_m.
However, for English we should also have some way of representing the fact that TREE (and
IDEA) denotes something which is not a sexed higher animal, such as a person or a horse 
and which therefore can only be referred to as it, not as s/he. In languages which dis- 
tinguish a 'vegetable' gender (e.g. Bininj-Gunwok) we might need to indicate the fact 
that TREE (and perhaps VEGETATION but not IDEA) denotes a kind of plant. In other lan- 
guages with semantically-driven gender other distinctions would have to be made. These 
observations hold for the determination of inflectional properties. However, for a specifi- 
cation of derivational morphology it is often necessary to appeal to very subtle nuances 
of meaning (Fradin & Kerleroux 2003). 

The point of this discussion of lexical semantics is that once the right semantic prop- 
erties are fixed much of the rest of the lexemic representation can be deduced by default. 
Thus, if an English lexeme belongs to the Thing ontological category (as opposed to the 
category Event or Property) then by default it will be a noun, with an argument structure 
that includes the SF role R. A syntactic noun will also be a noun morphologically, and if 
it is of subcategory Thing_c it will have a singular and plural form. This is more than just
a modern version of the notional parts-of-speech theory, however. Being defaults, all 
these inferences can, of course, be overridden by more specific lexical stipulations. Thus, 
a noun such as JOURNEY is ontologically an Event but grammatically it is a noun, so that 
the inference from Event to SF role E to [SYNCAT/MORCAT verb] is overridden in the lex- 
emic entry (for instance, by stipulating that its SF role is a simplex R). Moreover, in many 
languages there will be non-default morphological information to stipulate in addition 
to the phonology of the root. For instance, the Russian noun STOLOVAJA ‘canteen; dining 
room’ is a noun syntactically, but it has the morphology of a (feminine gender) adjective,
thus its [MORCAT adjective] value cannot be inferred from its [SYNCAT noun] value and 
has to be stipulated in the lexemic entry in some way. In some cases, not all argument 
structure or complementation properties can be deduced from the semantic representa- 
tion so those would need to be specified lexically. Some of the contextual properties of a 
lexeme such as special register, connotations, or other details of usage may also diverge
from the default and will therefore have to be recorded in the lexeme's entry. But the
limiting case of a lexical representation in the underspecified lexemic entry model is a
pure pairing of basic meaning with the form of the root (what Sag 2012 refers to as a
‘listeme’; see §3).
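The chain of default inferences described in this section can be sketched as follows. This is an illustration under stated assumptions, not the formalism of Spencer (2013). The attribute name ONT (ontological category) and the dictionary encoding are hypothetical, and only the Thing and Event cases mentioned in the text are covered.

```python
# Default inferences, applied in order; a stipulated value always wins.
SF_BY_ONTOLOGY = {"Thing": "R", "Event": "E"}   # ontological category -> SF role
SYNCAT_BY_SF = {"R": "noun", "E": "verb"}       # SF role -> syntactic category


def specify(entry):
    """Fill in every attribute the entry leaves unstated, by default."""
    full = dict(entry)
    full.setdefault("SF", SF_BY_ONTOLOGY[full["ONT"]])
    full.setdefault("SYNCAT", SYNCAT_BY_SF[full["SF"]])
    full.setdefault("MORCAT", full["SYNCAT"])   # morphology follows syntax
    return full


# Underspecified entries: only unpredictable information is listed.
TREE = {"LI": "TREE", "STEM0": "/tri:/", "ONT": "Thing"}
JOURNEY = {"LI": "JOURNEY", "ONT": "Event", "SF": "R"}   # stipulated SF role
STOLOVAJA = {"LI": "STOLOVAJA", "ONT": "Thing", "MORCAT": "adjective"}

for lexeme in (TREE, JOURNEY, STOLOVAJA):
    print(specify(lexeme))
```

TREE needs no stipulation beyond its root and ontological category. JOURNEY blocks the inference from Event to verb by stipulating its SF role, and STOLOVAJA blocks only the final step from SYNCAT to MORCAT.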


3 Lexemes as objects or descriptions 


The principal question to be addressed in this paper is: what kind of a representation is 
a dictionary (lexemic) entry? Specifically, is it a linguistic object in its own right? In this 
section I discuss the answers proposed in Sag's (2012) summary of SBCG. 

In SBCG, as in HPSG generally, a distinction is drawn between linguistic objects and 
the representational technology used to describe those objects, notably feature struc- 
tures (FSs) or attribute-value matrices (AVMs). An inflected word form, for example, is 
a linguistic object, but it can be described in various ways, including partial feature de- 
scriptions which underspecify certain aspects of the representation. A linguistic object 
proper, however, cannot be thus underspecified. This means, for instance, that Sag's lis- 
teme, the barest possible representation of a lexemic entry, must be a description, not an 
object in its own right. 

Sag (p. 98) introduces the notion of the lexeme into the model, giving it a special place 
in the type hierarchy of signs shown in Figure 1. This hierarchy defines the lexeme as 
a lexical sign, just like a word form. However, word forms appear as parts of syntactic 
phrases which can ultimately be pronounced, and so they count as linguistic expressions. 
A lexeme cannot be pronounced. This is not because it is some kind of ‘covert expression’, 
however (like gap and pro in Sag's type hierarchy). A lexeme is an altogether different 
kind of sign, in fact, a unique type given the hierarchy in Figure 1. 


sign
    expression
        covert-expr
            gap
            pro
        overt-expr
            phrase
            word
    lex-sign
        lexeme
        word


Figure 1: Sag’s (2012) type hierarchy

Sag provides examples of representations of word forms from English (plurals, past
tense forms) and in his Fig. 6 (p. 101), here reproduced as Figure 2, he gives the example
of the lexeme LAUGH. Notice that this representation actually seems to specify the word
form laughed, in that it bears the feature [VFORM psp].


[sintrans-v-lxm
 PHON    /laf/
 FORM    ⟨laugh⟩
 ARG-ST  ⟨NP_i⟩
 SYN     [syn-obj
          CAT   [verb
                 SELECT  none
                 VF      psp
                 XARG    NP_i
                 LID     ⟨[laughing-fr
                           LABEL   l4
                           SIT     s
                           S-SRCE  i]⟩]
          MRKG  unmk
          VAL   ⟨NP_i⟩]
 SEM     [sem-obj
          IND     s
          LTOP    l4
          FRAMES  ⟨[laughing-fr
                    LABEL   l4
                    SIT     s
                    S-SRCE  i]⟩]]


Figure 2: Sag's (2012: 111) representation of the lexeme LAUGH


It is worth citing Sag’s justification
for this choice of representation:


[T]he value psp illustrated here [...] represents an arbitrary expositional choice — 
any value of VFORM would satisfy the requirements imposed by the laugh listeme.
And each such choice gives rise to a family of well-formed FSs licensed by that 
listeme. (Sag 2012: 99) 


Sag here appeals to the LAUGH listeme. In SBCG a listeme licenses modelled linguistic ob- 
jects. This means that it places restrictions on what properties a modelled object or sign 
may have (p. 105). Another way of characterizing the listeme is as “a lexeme description 
in the lexicon” (p. 107). 

The type lexeme plays a central role in SBCG, in that it is the starting point for all mor- 
phology (Sag is here following PFM and related models). Inflection and derivation are 
modelled by means of morphological functions. An inflectional rule such as the English 
preterite (past tense) is modelled by a preterite-cxt, whose mother is the past tense form 
and whose daughter is the lexeme whose past tense form is being defined. A derivational 
rule is given by a construction whose mother is the derived lexeme and whose daughter 
is the base lexeme. 

Sag summarizes the morphological functions by saying (p. 113) that they express “[...]
the relation between the forms of two lexemes or the relation between the form of a
lexeme and the form of a word that realizes that lexeme.” This sounds like an expression
of conventional wisdom in lexeme-based morphology, but it hides a serious conceptual 
flaw. This centres around the way that Sag’s formulation uses the term ‘form’. The prob- 
lem is apparent from Sag’s description of the lexeme LAUGH. He is obliged to provide this 
representation with an arbitrary inflectional feature specification, in effect defining not 
the lexeme as such but one of its inflected forms. This is because a lexeme is meant to be 
a modelled object, a subtype of sign, and a linguistic object must be fully specified. But 
the whole point of defining a lexemic level of representation is to abstract away from 
actual (concrete) word forms. This means that the lexeme is effectively a description, in 
fact a partial description, of the full set of word forms. But that is completely incompati- 
ble with Sag's type hierarchy and, indeed, with any coherent interpretation of the HPSG 
lexicon. 

Given this reasoning we seem to have two logical courses of action. Either we can re- 
construct the HPSG lexicon without recourse to the type lexeme, or we can redefine the 
notion of linguistic object in such a way as to make a dictionary entry a kind of modelled 
object, even though it appears to be underspecified. I shall adopt the second approach. 

I propose to treat the lexicon as more than just a convenient descriptive fiction, as 
would be implied by a strict application of the object-description distinction. Rather, 
I take the lexicon to be a network of mentally represented (or representable) objects 
which can be defined and described by FSs just like (utterable and unutterable) linguistic 
expressions. 

By simply declaring a dictionary (lexemic) entry to be a kind of object we solve the 
immediate problem: the lexeme can remain a type of sign, and can be a supertype of other 
signs. Its unusual position in being partially underspecified is now reflected in the type
hierarchy: only the expression type has to be fully specified, a lexical sign may be only 
partially specified (lexeme), though when a lexical sign is also a subtype of expression 
(word) it, too, can, and must, be fully specified. 
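One way to picture this revised hierarchy is as a class hierarchy in which the full-specification requirement is attached to the expression type only. The sketch below is a toy under explicit assumptions. The REQUIRED tuple is a hypothetical stand-in for 'fully specified', and VFORM stands in for the whole set of inflectional attributes.

```python
class Sign:
    REQUIRED = ()   # attributes that must have a value for this type

    def __init__(self, **features):
        missing = [a for a in self.REQUIRED if features.get(a) is None]
        if missing:
            raise ValueError(f"{type(self).__name__} lacks {missing}")
        self.features = features


class Expression(Sign):
    REQUIRED = ("FORM", "VFORM")   # expressions must be fully specified


class LexSign(Sign):
    pass                           # lexical signs as such carry no requirement


class Lexeme(LexSign):
    pass


class OvertExpr(Expression):
    pass


class Word(OvertExpr, LexSign):    # a word is both an expression and a lex-sign
    pass


laugh = Lexeme(FORM="laugh")                    # well-formed though partial
laughed = Word(FORM="laughed", VFORM="psp")     # must be fully specified
```

Because Word inherits from OvertExpr before LexSign, it picks up the requirement imposed on expressions, whereas Lexeme inherits only from LexSign and may remain partial.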

Now, once we admit the possibility of an underspecified entity as an object in the 
linguistic ontology we are immediately faced with two sets of questions. The most
general of these is ‘are there other linguistic objects which can be less than fully specified?
Can any partially specified representation be interpreted as a modelled object? If so,
then what is the content of the original object-description distinction?’ It seems that we
should not be allowed to postulate such objects except in very special circumstances. But 
if we admit lexemes as less than fully specified objects what prevents us from postulating 
entirely arbitrary types? The simplest answer is to say that it is an architectural (i.e. stip- 
ulated) property of linguistic expressions that they be fully specified. However, whether 
this is really true may depend on how we perceive linguistic specification. Presumably, 
an object of type word such as dogs is to be regarded as a fully specified object and not 
a description, even when, for instance, its intonation and other prosodic characteristics 
are not specified. But in the strictest sense a word form remains partially underspecified 
until its full phonetic realization is given. Indeed, the same is true of sentences, which can 
be uttered with a very wide variety of affective intonation contours even when realizing 
one and the same set of discourse or information-structure functions. 

The second question is more immediately relevant: if we are to admit as an object a 
lexeme underspecified for its inflection properties, how much further can we go with the 
underspecification? For instance, we might want to say that our lexeme LAUGH is
underspecified for its inflectional properties by virtue of bearing the attribute values [TENSE
u, VFORM u, SUBJAGR u, ...] or whatever, where ‘u’ means ‘not yet specified value’, or
we may wish to make the more radical proposal that LAUGH lacks the actual attributes
[TENSE, VFORM, SUBJAGR, ...]. This may turn out to be little more than a matter of
notational convention, but in a more radical vein we can ask why we can't regard Sag's
maximally underspecified listeme as a default lexeme object. In other words, can we not adopt
the underspecified lexemic entry model for dictionary entries, as proposed in Spencer 
(2013)? We will see that the question assumes particular importance in defaults-based 
models of morphology such as PFM, where the lexeme concept finds its most elaborated 
implementation, and especially GPFM, where defaults define all aspects of lexical rep- 
resentation. Before turning to a consideration of the lexeme concept in such models I 
first discuss an important but generally neglected aspect of lexical representation and 
its relation to inflectional morphosyntax. 


4 The morpholexical signature (MORSIG) 


A lexeme of a given morpholexical class, such as ‘noun’, will (typically!) inflect for prop- 
erties particular to that class (say, NUMBER, CASE, DEFINITENESS, POSSESSOR AGREEMENT) 
and may have intrinsic properties which determine its morphosyntax, such as GENDER. 
The actual set of properties is stipulated for each language, so a grammar has to include a
declaration of that set. In the Generalized Paradigm Function Morphology (GPFM) model
of Spencer (2013) I refer to this declaration as the morpholexical signature (MORSIG). In
GPFM the MORSIG attribute is itself treated as a default property with respect to lexemic
entries/representations. By this I mean that the properties which make up the MORSIG
are true of every regular lexeme of the given class, so it would be redundant to specify
that information in the lexemic entry itself.

In Spencer (2013) I treat the MORSIG as a value of the FORM attribute, i.e. as a
morphological property of a lexeme, but this is an oversimplification. It is well-known that the
set of features needed to define a lexeme's syntactic distribution, and the set of gram- 
matical meanings expressed by inflected word forms, are often at variance with the set 
of features needed to define the inflected morphological forms themselves. The most ob- 
vious mismatches are found in periphrases. We often find that the morphological form 
of one of the elements of the construction bears properties which contradict the feature 
content expressed by the periphrasis as a whole. Elsewhere, the morphological element 
may be morphomic and therefore not associated with any meaning, or the periphrasis 
may express a meaning in the manner of an idiom, so that no part of it can sensibly be 
associated with the meaning of the periphrasis as a whole (Brown et al. 2012). Periphra- 
sis therefore motivates a distinction between m-features and s-features (mnemonically, 
morphological/syntactic features, Sadler & Spencer 2001). Similarly, Stump has argued 
for a modification of the original Paradigm Function Morphology (PFM) model, ‘PFM1’ 
(Stump 2001), in favour of a model, 'PFM2', which draws a distinction between FORM 
and CONTENT paradigms, on the basis of mismatches such as syncretisms, deponency 
and a variety of others (Stump 2002, 2006, 2016a,b). The obvious way to capture such 
distinctions in lexical representations is to assume that there is a SYN|MORSIG attribute 
which is mapped to a FORM|MORSIG attribute by means of a function, Stump's ‘paradigm 
linkage’. By default, paradigm linkage is the identity function, in the sense that the FORM 
paradigm or m-feature set is identical to the CONTENT paradigm/s-feature set. 

In GPFM the relation between the most highly underspecified lexical representation 
and a fully specified word form is mediated by two sets of functions. The second of these 
is effectively identical to the paradigm function of PFM2. It maps a pairing ⟨L, σ⟩, for
LI L and feature set σ, to a pair ⟨w, σ⟩, where w is the corresponding inflected word form. This
function is, however, only defined for a complete and coherent feature set. In other words 
the function cannot be defined for a representation which lacks a specification of those 
features for which the lexeme inflects, that is, the MORSIG. Therefore, to be inflectable
the lexeme's MORSIG attribute needs first to be specified (Inflectional Specifiability
Principle, Spencer 2013: 199). This is achieved by the first of the two functions, the default
specification of MORSIG for a given morphosyntactic lexical category.

An illustration of how this works can be given by (a simplified version of) the Turkish
noun (following the discussion in Stump 2016a: 175-179). The minimal lexical
information needed for, say, the word EV ‘house’ is shown in Figure 3 (using English as a
metalanguage). Turkish grammar stipulates that a count noun inflects for the properties
shown in Figure 4. The FORM|MORSIG attribute is almost identical except for a well-known
syncretism between the 3sg possessed form of ‘houses’, and the 3pl possessed forms of
‘house/houses’ and the ordinary unpossessed plural. We would expect these to take the
forms evler, evlerler, evler respectively, but the form evlerler is reduced by haplology to
evler. Clearly, the FORM paradigm makes fewer distinctions than the CONTENT paradigm.


FORM   [STEM0 [PHON /ev/]]
SEM    [Thing λx.house(x)]
LI     HOUSE


Figure 3: Lexemic entry for Turkish EV ‘house’


SYN|MORSIG   [NUMBER  {sg, pl}
              CASE    {nom, acc, gen, dat, loc}
              POSS    [PERSON  {1, 2, 3}
                       NUMBER  {sg, pl}]]


Figure 4: MORSIG for Turkish count noun lexeme


In PFM2 this mismatch is defined via a Correspondence function, Corr, which
specifies the distinct FORM features and CONTENT features and which defines the mismatches
giving rise to syncretism, deponency and so on. The details are not relevant here so I 
simply assume the existence of the Corr mapping. 
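The division of labour between the two functions can be sketched as follows, assuming that a MORSIG is encoded as a nested dictionary of attribute names and permitted values. The function names are hypothetical, and the returned 'word form' is a placeholder label, since nothing here attempts to realize actual Turkish forms or to model the Corr mapping.

```python
from itertools import product

# SYN|MORSIG for a Turkish count noun, transcribed from Figure 4.
SYN_MORSIG = {
    "NUMBER": ("sg", "pl"),
    "CASE": ("nom", "acc", "gen", "dat", "loc"),
    "POSS": {"PERSON": (1, 2, 3), "NUMBER": ("sg", "pl")},
}


def cells(morsig):
    """Enumerate every complete feature set licensed by a (nested) MORSIG."""
    attrs = list(morsig)
    options = [cells(v) if isinstance(v, dict) else list(v)
               for v in morsig.values()]
    return [dict(zip(attrs, combo)) for combo in product(*options)]


def specify_morsig(entry):
    """First function: supply the default MORSIG for the lexeme's class."""
    return {**entry, "MORSIG": entry.get("MORSIG", SYN_MORSIG)}


def paradigm_function(entry, sigma):
    """Second function: defined only for a complete, coherent feature set."""
    if "MORSIG" not in entry:
        raise ValueError("lexeme is not inflectable: MORSIG unspecified")
    if sigma not in cells(entry["MORSIG"]):
        raise ValueError("incomplete or incoherent feature set")
    return (f"w({entry['LI']})", sigma)   # placeholder for the realized form


EV = {"LI": "HOUSE", "STEM0": "/ev/"}     # minimal entry, as in Figure 3
SIGMA = {"NUMBER": "pl", "CASE": "nom", "POSS": {"PERSON": 3, "NUMBER": "sg"}}
RESULT = paradigm_function(specify_morsig(EV), SIGMA)
```

Applying paradigm_function directly to the minimal entry EV raises an error, which is the sketch's analogue of the Inflectional Specifiability Principle.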


5 Lexical relatedness and the role of the Lexemic Index 


The notion of lexemic representation (lexeme, lexical entry) plays an important role in 
the I-R class of models. This is especially true of GPFM, because that model attempts 
to unify inflection with (regular, productive, paradigmatic) derivational morphology. If 
we say, for the sake of argument, that English Subject Nominal (SubjNom) formation is 
paradigmatic then we can define it by recourse to a derivational feature (cf. Stump 2001:
257) SN, such that the generalized paradigm function, GPF, will map a verb lexeme to its
subject nominal: GPF(⟨L, SN⟩) = ⟨L′, SN⟩, where L′ is the LI of the subject nominal of the
verb L. However, the GPF cannot apply in exactly the way that the PF applies in PFM2. In
PFM2 the PF maps a pairing of ⟨LI, features⟩ to a word form (via the Corr function). But
the output of a derivational function has to be some representation of an independent 
lexeme. This means that when a derivational feature is in the domain of the GPF it must 
map to a representation of that derived lexeme, not to a word form. But the standard 
architecture of PFM2 (including the Corr function) does not permit this. The problem is 
at heart very familiar: while inflectional morphology specifies word forms that realize 
the particular morphosyntactic property set of a lexeme, derivational morphology effects 
wholesale changes in syntactic and semantic representations, undermining the basic I-R
assumptions under which morphology simply serves to realize property sets. 

In the GPFM model of Spencer (2013), derivational morphology requires the GPF to 
perform a kind of ‘deletion’ of the base lexeme’s properties, followed by respecification 
by means of defaults driven by the enriched SEM representation of the derived lexeme.
However, a more parsimonious way to represent derivational morphology is to map the 
maximally underspecified base lexeme’s entry to a maximally underspecified derived 
entry. This obviates the need to delete most of an entry’s specifications, in that they are 
lacking in any case. Thus, for the lexeme DRIVE and its SubjNom DRIVER a schematic
application of the GPF would be as in Figure 5 (where SN(DRIVE) is a function from LIs to LIs
governed by the derivational feature, defining the LI of the derived lexeme, DRIVER). This
type of application can be thought of as an elaborated, feature-driven word formation
rule (WFR) in the sense of Aronoff (1976).


FORM   [STEM0|PHON        /draɪv/
        STEMpst|PHON      /drəʊv/
        STEMpstptcp|PHON  /drɪv/]
SEM    [Event λx,y.drive(x, y)]
LI     DRIVE

            ↓

FORM   [STEM0|PHON  /draɪv/⊕/ə/]
SEM    [Thing λx[person(x) ∧ ∃y.drive(x, y)]]
LI     SN(DRIVE)


Figure 5: Derivation of DRIVER from DRIVE


Now, the output of the GPF is the representation of a Thing, so by default it will have
all the morphosyntactic properties of a noun.² In languages with nominal inflectional
classes the GPF may additionally have to specify which inflectional class the derived
noun belongs to, as a FORM property overriding whatever the default specification for
noun inflection class is, just as would be the case with a simplex (underived) lexemic
entry belonging to a non-default inflectional class. The function in Figure 5 fails to
transfer the non-default (stipulated) specification of the past tense and past participle stems
from the base verb to the subject nominal, giving rise to a kind of despecification. There
is an important rationale behind the despecification of lexemic entries in Spencer (2013):
derivation, unlike inflection, leads to lexical opacity. Thus, the derived lexeme DRIVER
lacks any specification which would identify it as having a base with past tense or past
participle forms, irregular or otherwise, or, indeed, any of the morphosyntactic
properties associated with a finite verb. In this case the failure of the past and past participle
forms to be inherited by the derived noun is the consequence of the definition of the
MORSIG attribute for nouns as opposed to that for verbs. The GPF for SubjNom specifies
exactly one STEM0 form (for regular lexemes). This can be unified with the default MORSIG
specification associated with Thing lexemes. Since the Thing ontological category does
not license inflectional (s-feature) paradigm properties other than NUMBER in English,
there would be no way for any tense or participle features to unify with the MORSIG
attribute once it is specified. The only additional assumption that we need to make here
is that SubjNom derivation is the kind of lexical relatedness which defines an entirely
new MORSIG (i.e. one which ‘deletes’ the MORSIG of the base entry). I return later in this
section to the question of how we characterize the class of relatedness functions which
fail to preserve the base lexeme's MORSIG attribute in this way.

² DRIVER is a count noun, of course. I assume that this can be made to follow from the fact that a driver is a
subtype of person.
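The mapping from one maximally underspecified entry to another can be sketched as below. The dictionary encoding, the function names and the keys ONT and PRED are assumptions made for illustration, and the stem transcriptions are approximate. Only the default MORSIG for English Thing lexemes is included, since that is the only one the argument requires.

```python
DEFAULT_MORSIG = {"Thing": {"NUMBER": ("sg", "pl")}}   # English nouns only

DRIVE = {
    "LI": "DRIVE",
    "FORM": {"STEM0": "draɪv", "STEMpst": "drəʊv", "STEMpstptcp": "drɪv"},
    "SEM": {"ONT": "Event", "PRED": "drive"},
}


def subject_nominal(base):
    """Schematic GPF for SN: minimal base entry -> minimal derived entry."""
    pred = base["SEM"]["PRED"]
    return {
        "LI": f"SN({base['LI']})",
        "FORM": {"STEM0": base["FORM"]["STEM0"] + "ə"},   # one stem only
        "SEM": {"ONT": "Thing", "PRED": f"λx[person(x) ∧ ∃y.{pred}(x, y)]"},
    }


def specify_morsig(entry):
    """Supply the default MORSIG licensed by the entry's ontological category."""
    return {**entry, "MORSIG": DEFAULT_MORSIG[entry["SEM"]["ONT"]]}


DRIVER = subject_nominal(DRIVE)
print(specify_morsig(DRIVER))
```

Nothing is deleted in this version: the irregular past and participle stems of DRIVE are simply never copied, and the derived entry acquires a nominal MORSIG by the same default that applies to a simplex noun.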

In true derivational morphology the LI of the output lexeme is always distinct from
that of the base. This reflects the most significant difference between derivational types
of lexical relatedness, on the one hand, and types of lexical relatedness broadly thought
of as inflectional, on the other hand: derivation defines new lexemes while inflection
defines forms of lexemes. However, in GPFM, preservation or alteration of the LI is just one
parameter of relatedness, almost entirely independent of other parameters (this is the
Principle of Representational Independence, Spencer 2013: 139). In particular, we
systematically encounter two types of situation in which the crucial feature of the relatedness
is the preservation or change of the base lexeme's LI.

The first of these is the class of relatedness types called transpositions, in which the 
morphosyntactic class of a word changes, as in typical derivation, but in which there 
is no creation of a novel lexeme with a distinct LI. In a canonical transposition the SEM 
value, that is, the conceptual content of the representation, does not change either. 

The second type of case is very similar. Here the lexical relation defines a distinct lex- 
eme but does not alter the conceptual content of the base. These are what I have called 
transpositional lexemes (Spencer 2013: 275; 359-60; Spencer 2016). Simple examples are 
adjectives derived from participles such as interesting, bored or so-called relational ad- 
jectives (in English and other European languages) such as prepositional, ferrous. These 
contrast with superficially similar cases in which the derived adjective differs seman- 
tically from its (etymological) base: budding (linguist), harrowing (experience), gaping 
(hole); outspoken, unspoken, incensed, poised; popular (= ‘well-liked’), spectacular. Distin- 
guishing true transpositions from transpositional lexemes and transpositional lexemes 
from other, often homophonous, adjectives is important for understanding the nature of 
lexical representations and types of lexical relatedness. In some cases, the only difference 
between the lexical representation of a true transposition and that of the homophonous 
transpositional lexeme is the difference in LI. However, in many cases the transposi- 
tional lexeme has different syntactic privileges from the homophonous transposition by 
virtue of being an independent lexeme. For instance, the adjective interesting has the 
complementation properties of an adjective, not of a verb or a true participle, as seen by 
comparing the true participle in (1) with the true adjective in (2). 
(1) interesting = participle
    a. the book (*very) interesting the children
    b. * The book seems interesting the children.


(2) interesting = adjective
    a. the book most interesting to the children
    b. The book seems interesting to the children.


Comparable examples can be found with Russian participles and participial lexemes. 

A clear instance of a true transposition is the (deverbal) participle, familiar from 
many languages, including almost all Indo-European languages. In Russian, for instance, 
we find four participles, realizing the properties [VOICE {act, pass}], [ASPECT {pfv, ipfv}]
(Spencer 2017). These inflect exactly like adjectives and their principal function is that of 
attributive modifier to a noun. However, in addition to expressing the verbal properties 
of voice and aspect the participles also retain the argument structure/complementation 
of the base verb, including quirky case assignment. They are thus prototypical examples 
of mixed categories. 

In Spencer (2013, 2017) I argue that participles belong to the base verb’s paradigm in 
the broadest sense, and that this means their LI is that of the base verb. In an I-R model
this means that the participles are defined by a ⟨feature, value⟩ pair, just like tense or
number forms, and I propose the feature REPR(ESENTATION), following Russian descrip- 
tive tradition (see, for instance, Kuznecova et al. 1980, Helimski 1998 for the Samoyedic 
language Selkup, which is particularly rich in transpositions; see also Haspelmath 1996). 

Following Spencer (2017) I notate the feature as REPR⟨K,A⟩, denoting a transposition
from category K to category A. For example, a participle would be defined by the feature
REPR⟨V,A⟩.³ The GPF(⟨V, {REPR⟨V,A⟩, σ}⟩) applies to a verb lexeme V and defines a
participle realizing features σ. For instance, the Russian perfective passive participle udar'onn-
from UDARIT ‘hit, strike’ is defined by (3).


(3) GPF(⟨UDARIT, {REPR⟨V,A⟩, {[ASPECT pfv], [VOICE pass]}}⟩).


The GPF in (3), however, only defines the stem of the participle. In order to inflect it as an
adjective it must be given an appropriate MORSIG, inheriting CONCORD (agreement)
features from the adjective class, permitting the participle to agree with the head noun. This
addition to the MORSIG is an automatic consequence of redefining the morphosyntactic
class as adjective. The technical details of exactly how this is achieved are provided in
Spencer (2017). The GPF which defines the stem of the participle defines a lexical
representation which is thus very similar to that of a (maximally underspecified) simplex
adjective before it receives the default MORSIG specification. In this way the participle
resembles an autonomous adjectival lexeme, whilst remaining a form (better, the adjectival
representation) of the verb, what we could call a ‘quasi-lexeme’.
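The contrast with derivation proper can be sketched in the same toy encoding. Everything here is an assumption made for illustration, not the mechanism of Spencer (2017). The stem is looked up rather than computed, the ARG-ST value is a placeholder for a transitive frame, and the CONCORD feature names are the standard agreement categories of Russian adjectives.

```python
PARTICIPLE_STEMS = {("UDARIT", "pfv", "pass"): "udar'onn-"}    # from the text
ADJECTIVE_MORSIG = {"CONCORD": ("GENDER", "NUMBER", "CASE")}   # assumed names

UDARIT = {"LI": "UDARIT", "MORCAT": "V", "ARG-ST": ("NP", "NP")}


def gpf_repr(lexeme, target, sigma):
    """Schematic GPF for REPR<V,A>: a new category but the same LI."""
    key = (lexeme["LI"], sigma["ASPECT"], sigma["VOICE"])
    return {
        "LI": lexeme["LI"],             # no new lexeme is created
        "STEM": PARTICIPLE_STEMS[key],
        "MORCAT": target,               # inflects as an adjective
        "ARG-ST": lexeme["ARG-ST"],     # verbal complementation retained
        "REALIZES": dict(sigma),
    }


def specify_morsig(representation):
    """Add the default MORSIG of the (new) morphological class."""
    if representation["MORCAT"] == "A":
        return {**representation, "MORSIG": ADJECTIVE_MORSIG}
    return representation


PTCP = gpf_repr(UDARIT, "A", {"ASPECT": "pfv", "VOICE": "pass"})
QUASI_LEXEME = specify_morsig(PTCP)
```

Unlike the subject nominal function, this mapping leaves the LI untouched, so the resulting representation behaves like an adjective for inflection while still belonging to the verb's paradigm.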


³ The labels ‘V, A’ are for convenience. In fact, it is likely that all ‘capital letter’ lexical/phrasal (‘c-structure’)
category labels (N, V, A, P) can be dispensed with, in favour of appeal to more fine-grained properties,
especially the SF roles (Spencer 1998, 1999, 2013: 322-23; see also Chaves 2014 for similar remarks).
Here is, in broad outline, how the GPF would deliver the quasi-lexeme form udar'onn-.
A (partial) FS for the MORSIG of a typical transitive verb is shown in Figure 6. The FS
in Figure 6 shows those morphosyntactic properties that are reflected in the
grammatical system of Russian. It does not, however, tell us what the inflected forms are. This
is because that FS defines the CONTENT paradigm feature set, not the FORM paradigm 
set. For instance, [TENSE fut] is only expressed morphologically in [ASPECT pfv] verb 
forms; in imperfective verb forms future tense is expressed periphrastically. Similarly, 
[VOICE pass] is only expressed synthetically in imperfective verb forms (where it actually 
borrows forms marked [REFLEXIVE yes]); in perfective verb forms it is expressed again 
periphrastically. 


SYN | MORSIG | [ ASPECT {pfv, ipfv}
                 VOICE  {act, pass}
                 TENSE  {prs, pst, fut}
                 [...]  [ ASPECT {pfv, ipfv}
                          VOICE  {act, pass} ] ]


Figure 6: Partial MORSIG for a Russian transitive verb


The somewhat complex mapping between CONTENT and FORM paradigms in Russian
verbs is explored in greater detail in Spencer (2017). The precise characterization of the
FORM or m-features for Russian verbs is controversial (as it is for most languages,
including English). In Spencer (2017), for instance, I argue that the FORM paradigm has
a single-valued m-[TENSE prs-fut] feature, accounting for both the present tense inflections
of imperfective verb forms and the (identical) future tense inflections of perfective
verb forms. Likewise, the CONTENT paradigm feature s-[TENSE pst] is expressed by a
morphomic l-participle form ([VFORM lptcp]), which has no semantic interpretation of
its own but which co-realizes s-[MOOD conditional] in conjunction with the particle by.
Elsewhere, by default the l-participle realizes the CONTENT paradigm s-[TENSE pst] feature
value. The specification [TENSE pst] has no FORM/m-feature counterpart.

The partial specification in Figure 6 also shows us that a transitive verb in Russian has 
four participial forms, listed in Table 1, where the parenthesized suffixes (-ij, ...), (-yj, ...) 
indicate the agreement inflections. 


Table 1: Participles of Russian UDAR'IT ‘hit’ 


udar'-aju-šč(-ij, …)    imperfective active
udar'-aje-m(-yj, …)     imperfective passive
udar'-i-vš(-ij, …)      perfective active
udar'-on-n(-yj, …)      perfective passive
Given the MORSIG in Figure 6 the GPF can apply to a pairing ⟨U, π⟩, where U is
the LI of UDARIT and π is a mnemonic shorthand for the set of participial features
{REPR⟨V,A⟩, {[ASPECT pfv], [VOICE pass]}}. In the original PFM models (PFM1 and PFM2)
the paradigm function serves solely to define inflected forms (and periphrastic realizations
of certain inflectional features). In terms of the lexical representational schemas
discussed so far this means that the PF operates solely at the level of the FORM attribute.
In GPFM the PF is generalized to four functions, operating over the FORM, SYN, SEM, LI
attributes. The first of these, f_form, is the classical PF. For ordinary inflectional
morphology the f_syn, f_sem, f_li functions have no material effect and behave like identity
functions. Thus, the GPF for pure inflection collapses with the classical PF. However, for
paradigmatic derivational morphology all four functions can introduce non-trivial changes, as
we saw earlier in the case of the derivation of DRIVER from DRIVE.
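The division of labour among the four component functions can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the formalism of Spencer (2017): the class `LexRep`, the helper `keep`, the string values for SYN and SEM, and the feature labels `"3sg"` and `"SubjNom"` are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class LexRep:
    """Toy lexical representation with the four attributes FORM, SYN, SEM, LI."""
    form: str
    syn: str
    sem: str
    li: str


Component = Callable[[LexRep, FrozenSet[str]], str]


def keep(attr: str) -> Component:
    """Identity-like component: return the named attribute unchanged."""
    return lambda rep, feats: getattr(rep, attr)


def gpf(rep: LexRep, feats: FrozenSet[str], f_form: Component,
        f_syn: Component = keep("syn"),
        f_sem: Component = keep("sem"),
        f_li: Component = keep("li")) -> LexRep:
    """Generalized paradigm function: apply one component per attribute."""
    return LexRep(f_form(rep, feats), f_syn(rep, feats),
                  f_sem(rep, feats), f_li(rep, feats))


# Hypothetical base entry; the SYN and SEM strings are placeholders.
DRIVE = LexRep(form="drive", syn="V", sem="DRIVE(x, y)", li="DRIVE")

# Pure inflection: only f_form does any work, so the GPF behaves like the classical PF.
drives = gpf(DRIVE, frozenset({"3sg"}), f_form=lambda r, f: r.form + "s")

# Paradigmatic derivation: all four components introduce changes.
driver = gpf(
    DRIVE, frozenset({"SubjNom"}),
    f_form=lambda r, f: r.form + "r",
    f_syn=lambda r, f: "N",
    f_sem=lambda r, f: "PERSON_WHO(" + r.sem + ")",
    f_li=lambda r, f: "DRIVER",
)
```

In this sketch the inflected form keeps the Lexemic Index of its base, whereas the derived noun receives a new one, which is the contrast the text draws between inflection and derivation.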

The case of transpositions such as participles is midway between that of pure or
canonical inflection and derivation. The LI and SEM attributes remain unchanged but
both FORM and SYN attributes have to be (re-)specified. Following Spencer (1999, 2013),
in Spencer (2017) I assume that the category of a transposition is defined in terms of
a complex SF role. A simplex verb has the SF role [ARG-ST|SF E] and an adjective the
SF role [ARG-ST|SF A]. A participle is the adjectival representation of a lexeme with SF
role E. The notion ‘adjectival representation’ is captured by defining a complex SF role
⟨A⟨E⟩⟩. To simplify the exposition I shall assume that the complex SF role is cashed out
as a complex category label, [A [V]] (at the SYN level SYNCAT|[A [V]], at the FORM level
MORCAT|[A [V]]).⁴ The GPF for a participle, as defined by the attribute REPR⟨V,A⟩, will
define a form with this new category, as shown in (4).


(4) f_syn(⟨U, π⟩) = …
    [SYN|SYNCAT V] ⇒ [SYN|SYNCAT [A [V]]]


The transpositional feature specification π will also define a restatement of the MORSIG
attribute for the participle, as shown in (5).


(5) [ASPECT], [VOICE] ⊂ [SYN|MORSIG]


The statement in (5) is more specific than the default specification and hence it will
override that default. However, the participles in Russian (unlike some languages) are
actually adjectival forms. Therefore, their lexical representations must include a feature
defining their agreement properties, which for convenience I will label CONCORD. This
feature must be included there, in the participle's MORSIG. However, that fact, together
with the definition of [CONCORD], is inherited from elsewhere in the grammar in the
definition of adjectival inflection, as shown in (6).

(6) a. SF ⟨A⟨…⟩⟩ ⇒ [CONCORD] ⊂ [SYN|MORSIG]

    b. [NUMBER], [GENDER], [CASE] ⊂ [CONCORD]


⁴In fact, it seems that the device of complex SF roles allows us to dispense entirely with traditional syntactic
category labels (see also footnote 3).
Declaration (6) is so formulated that it applies to any word type whose ‘outermost’ category
label is defined by the complex SF ⟨A⟨…⟩⟩. This will trivially include simplex adjectives,
of course, but it also includes (true transpositional) participles (SF ⟨A⟨E⟩⟩) and
true relational adjectives (SF ⟨A⟨R⟩⟩). Russian participles are well-behaved morphologically
and so they will inherit very nearly all the FORM|MORSIG properties implied by the
SYN|MORSIG specification.

We are now in a position to state the full GPF defining the perfective passive participle, 
an extension of the GPF shown schematically in (3). This is shown in (7). It defines the 
object represented by the FS given in Figure 7. 


(7) GPF for the perfective passive participle of UDARIT ‘hit’
    Where U is the Lexemic Index of the lexeme UDARIT ‘hit’ and π is the feature set
    [REPR⟨V,A⟩ [ASPECT pfv, VOICE pass]], the passive perfective participle stem form
    is defined by a generalized paradigm function, GPF(⟨U, π⟩) =

    (i) f_form(⟨U, π⟩) =
        [FORM STEM_ppp = PHON STEM0(U) ⊕ onn = /udar'onn/]

    (ii) f_syn(⟨U, π⟩) =
        SYN [ SYNCAT [A [V]]
              ARG-ST ⟨(x), y⟩
              MORSIG [ ASPECT pfv
                       VOICE  pass ] ]

        where (x) denotes the suppressed external argument of the passive.

    (iii) f_sem(⟨U, π⟩), f_li(⟨U, π⟩) are the ‘identity function’ (no change in
        representation).
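The two-step logic of (5) to (7), in which the GPF first builds the underspecified participle and the adjectival default then supplies CONCORD, can be sketched as follows. This is a toy illustration with plain Python dictionaries, not the feature-structure formalism itself; the function names, the dictionary layout, the base stem value `udar'` and the base verb's ARG-ST and MORSIG values are assumptions made for the example.

```python
import copy

# Hypothetical base entry for the verb; only the attributes needed here are shown.
UDARIT = {
    "FORM": {"STEM0": "udar'"},
    "SYN": {"SYNCAT": ("V",), "ARG-ST": ["x", "y"],
            "MORSIG": {"ASPECT": None, "VOICE": None, "TENSE": None}},
    "SEM": "hit",
    "LI": "UDARIT",
}

PI = {"REPR": ("V", "A"), "ASPECT": "pfv", "VOICE": "pass"}


def f_form(rep, pi):
    """FORM component: participle stem = base stem + onn, as in (7i)."""
    return {"STEM_ppp": rep["FORM"]["STEM0"] + "onn"}


def f_syn(rep, pi):
    """SYN component: complex category [A [V]] and the restated MORSIG of (5)."""
    return {
        "SYNCAT": ("A", ("V",)),
        "ARG-ST": ["(x)", "y"],  # (x) = suppressed external argument
        "MORSIG": {"ASPECT": pi["ASPECT"], "VOICE": pi["VOICE"]},
    }


def gpf_participle(rep, pi):
    """SEM and LI are passed through unchanged (identity components)."""
    return {"FORM": f_form(rep, pi), "SYN": f_syn(rep, pi),
            "SEM": rep["SEM"], "LI": rep["LI"]}


def add_adjectival_defaults(rep):
    """Default in the spirit of (6): an outermost A category brings CONCORD into MORSIG."""
    out = copy.deepcopy(rep)
    if out["SYN"]["SYNCAT"][0] == "A":
        out["SYN"]["MORSIG"].setdefault(
            "CONCORD", {"NUMBER": None, "GENDER": None, "CASE": None})
    return out


quasi_lexeme = gpf_participle(UDARIT, PI)          # before default specification
inflectable = add_adjectival_defaults(quasi_lexeme)  # after default specification
```

The point of keeping the two steps apart is that the second function makes no reference to participles at all: it would apply in the same way to any representation whose outermost category is A.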


The redefinition of the MORSIG attribute to include two attributes inherited from the
verb base together with the new CONCORD attribute is part of the morphosyntactic definition
of ‘participle’ in Russian. However, the subsequent inflection of the participle as
an adjective follows entirely from the more general characterization of adjectives,
independently of their origin. For instance, it is equally applicable to a purely derivational
adjective such as svet-l-yj ‘bright, light’ from svet ‘light’, or krov-av-yj (režim) ‘bloody
(regime)’ from krov’ ‘blood’. This means that the participle feature ensemble π defines an
underspecified lexical representation which has exactly the same type of structure as an


⁵The main caveats here concern participles used as predicates, where there are a number of restrictions. The
participle also retains crucial verb properties such as complementation and even quirky case assignment,
so we need to ensure that those properties are inherited by the participle when the GPF is applied to π.
This would require a much more detailed discussion of the lexical representation of verbs, so I refer the
reader to Spencer (2017) where some of those details are worked out.
FORM  [ STEM_ppp | PHON /udar'onn/ ]
SYN   [ SYNCAT [A [V]]
        ARG-ST ⟨(x), y⟩
        MORSIG [ ASPECT pfv
                 VOICE  pass ] ]
SEM   ‘hit’
LI    UDARIT


Figure 7: “Quasi-lexemic” feature structure for Russian passive perfective participle
udar'onn


FORM  [ STEM_ppp [PHON /udar'onn/]
        MORCAT [1] [… adj]
        MORSIG [2] ]
SYN   [ SYNCAT [1] [A [V]]
        ARG-ST ⟨(x), y⟩
        MORSIG [2] [ ASPECT  pfv
                     VOICE   pass
                     CONCORD [NUM, GEND, CASE] ] ]
SEM   ‘hit’
LI    UDARIT


Figure 8: Feature structure for passive perfective participle udar'onn after default
specification of MORSIG
independent simplex or derived adjectival lexeme. It is in this respect that the participle
behaves as a quasi-lexeme, having the inflectional and morphosyntactic potential of an 
adjective but remaining a ‘form’ (more precisely, representation) of the base verb. 

The analysis now brings us back to one of the questions posed earlier — is the repre- 
sentation in Figure 7 an object or a description? 

If we regard Figure 7 as a description (vs. object) then it would presumably have to 
describe an object of type word. But this would entail that it describes some particular 
inflected form, say, the feminine instrumental plural. But the participle is not specified 
for those or any other CONCORD features, just as Sag’s FS for LAUGH is underspecified for
any inflectional feature set. This makes the participle FS look exactly like a lexemic entry, 
which ex hypothesi is an object not a description. It is this object that I have informally 
referred to as a quasi-lexeme. However, from the perspective of the grammatical system, 
it is a lexeme, albeit not one which is independent of its verb base. 

The participle shares its Lexemic Index with the base verb in all its inflected forms. 
However, it is easy to imagine such a representation undergoing the simplest type of 
lexicalization, namely, to acquire its own unique LI. This would happen if the participle 
were recategorized as a simplex adjective, that is a member of the morphosyntactic cat- 
egory [A] rather than [A [v]]. This is then the representation of a transpositional lexeme 
of the type interesting. Russian, too, has such converted participial lexemes, though they 
often do not correspond to English transpositional lexemes. Examples are potr'asájuščij
‘amazing’ from potr'asát' ‘to amaze’, izmúčonnyj ‘exhausted’ from izmúčit' ‘to exhaust’
and many others (see Spencer 2017 for further discussion). The crucial point is that these
derived adjectival lexemes do not seem to differ from their verb bases in their semantics, 
just like true transpositions, yet they behave syntactically like independent lexemes. 


6 Lexemes and types 


We have arrived at the conclusion that the lexical representation of a participle is non- 
distinct in crucial ways from the representation of a lexeme, and for this reason the 
grammar will treat it as a linguistic object, akin to a lexeme. This invites the conclusion 
that the participle is, in fact, a subtype of the type lexeme in the hierarchy proposed by 
Sag (2012), say, ptcp-lxm. The problem would then be to define where ptcp-lxm fits in
the type hierarchy. A participle inherits from both adjectives and verbs, as illustrated in 
Figure 9, adapting Sag's hierarchy for English (with obvious modifications for Russian). 
This would be in keeping with Malouf's (2000) approach to deverbal nominalizations. 
However, there are a number of problems with this solution. One of these relates to the 
‘directionality’ or ‘headedness’ of transpositions: a transposition is a representation of 
its base lexeme. In that respect a participial quasi-lexeme bears the same relationship to 
a verb that, say, the past tense form bears. But this is not captured in a hierarchy such as
that sketched in Figure 9, where the two mothers of ptcp-lxm, namely verb-lxm and adj-lxm,
stand in an equal relation to it. As a result, there will be no way of distinguishing
between the adjectival representation of a verb and the verbal representation of an ad- 
jective (that is, a transpositional predicative adjective heading a finite clause and bearing 
inflections for verb features such as tense-mood-aspect-polarity or subject agreement). 
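The symmetry problem can be made concrete with a toy encoding of the hierarchy as a parent map. This is a sketch under assumptions, not Sag's (2012) formal system: the type name `pred-adj-lxm` (a verbal representation of an adjective) is hypothetical and introduced only for the comparison, and the helper `supertypes` is likewise an illustration.

```python
# Parent map for a fragment of the hierarchy; each type lists its immediate mothers.
PARENTS = {
    "sign": {"linguistic-object"},
    "lex-sign": {"sign"},
    "lexeme": {"lex-sign"},
    "verb-lxm": {"lexeme"},
    "adj-lxm": {"lexeme"},
    # Adjectival representation of a verb (the participle).
    "ptcp-lxm": {"verb-lxm", "adj-lxm"},
    # Hypothetical: verbal representation of an adjective.
    "pred-adj-lxm": {"verb-lxm", "adj-lxm"},
}


def supertypes(type_name):
    """Collect every type reachable by following mother links upwards."""
    seen = set()
    stack = [type_name]
    while stack:
        for mother in PARENTS.get(stack.pop(), ()):
            if mother not in seen:
                seen.add(mother)
                stack.append(mother)
    return seen


# Both transpositions inherit from exactly the same set of types, so the
# hierarchy alone does not record which mother is the base.
same_ancestry = supertypes("ptcp-lxm") == supertypes("pred-adj-lxm")
```

Because mother links are unordered, nothing in such a structure marks one mother as the base and the other as the derived representation; some additional device would be needed to record directionality.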
[Type hierarchy diagram. Recoverable nodes: linguistic-object; sign; lex-sign and
expression (under sign); lexeme (under lex-sign); word; verb-lxm and adj-lxm (under
lexeme); ptcp-lxm (under both verb-lxm and adj-lxm).]


Figure 9: Revised partial type hierarchy


Perhaps, then, we should adopt a different approach. Since participles are morphologically
derived we can set up a construction type in SBCG (or a lexical rule in standard
HPSG) which would perform the same role as the GPF applied to the REPR feature in 
GPFM. Sag defines two sorts of morphological construction relevant to us in this con- 
text, the infl-cxt and the deriv-cxt. 


(8) infl-cxt: (Sag 2012: 115)

    [ MTR  word
      DTRS list(lexeme) ]


(9) deriv-cxt: (Sag 2012: 119)

    [ MTR  lexeme
      DTRS list(lex-sign) ]


The formulation in (9) additionally permits derivation from word forms, but in general 
derivation is defined over lexemes and to simplify the discussion I will assume that this is 
always the case. If we take a participle to be a subtype of lexeme, then participle formation 
will be a subtype of the derivational construction shown in (9). 

One issue that has to be resolved when incorporating morphological models into lexi- 
calist syntactic models arises from the fact that I-R models of morphology are generally 
based on default inheritance logic, while the syntactic models generally avoid the use 
of defaults and overrides. An important proposal for marrying the two systems is given 
by Bonami & Samvelian (2015) in the context of analysing periphrastic constructions in 
Persian (see also Bonami & Webelhuth 2012). The details depend on the specifics of their 
analysis, but the overall import of their proposal is a ‘meta-constraint’ on signs of type
word, such that a word is licensed in the (HPSG) syntax only if a corresponding
representation of it is also licensed in the (PFM) morphology (Bonami & Samvelian 2015: 32).
In effect, they treat the PFM morphology as a ‘black box’ whose outputs bear properties 
that can be recognized by the syntax. 

The interface for canonical inflection works well. However, the proposals do not touch 
directly on other types of morphology, notably derivation and transpositions. Presum- 
ably, the interface principle could be extended so as to apply between a morphological 
engine and the HPSG lexicon. A major problem here is the lack of consensus over how to 
handle derivational morphology in I-R models. In PFM there has been very little discus- 
sion of derivation and no discussion of transpositions. Concrete proposals for derivation 
and transpositions can be found in the Network Morphology model of Brown & Hippis- 
ley (2012) but it is not clear how that model would interface with syntax. Moreover, it 
is not clear how the Network Morphology model distinguishes between transpositions 
and canonical derivation, and between these and the (non-canonical) phenomenon of 
transpositional lexemes. 

A detailed set of proposals for defining lexical relatedness is given in Spencer (2013), 
where I show that there are many other types of relatedness between words in addi- 
tion to canonical inflection, canonical derivation and true (canonical) transposition. Any 
model of the lexicon has to be able to account for all these types. They include meaning- 
changing inflection, meaning-changing transposition, derivation which involves no 
change at all in FORM properties (morphologically inert derivation) and others. The con- 
ceptual problem here is that any of these types of relatedness might be part of the paradig- 
matic grammatical system in a given language, in which case the morphological means 
by which they are all expressed cannot be distinguished. Therefore, the same kind of 
machinery has to be deployed for paradigmatic derivation as for inflection. Given our 
current assumptions this means some form of paradigm function, defined in terms of 
defaults and overrides, and the challenge is therefore to ensure that the lexical repre- 
sentations so defined are compatible with the kinds of representations deployed in the 
syntax. 


7 An agenda for lexical representation 


The foregoing discussion raises more questions than it answers, but the questions are
important for lexicalist, constraint-based models generally, and for theories of lexical
representation and morphology generally. Here, by way of a conclusion, I summarize the
main issues that have emerged.


* Are lexemes partially specified linguistic objects? 


* What is the relationship between transpositional quasi-lexemes and canonical
lexemes?


This includes Stump (2016a,b), which are concerned exclusively with FORM/CONTENT mismatches. 
* How do we ensure that I-R morphological models can interface with constraint-based
syntactic models, including all aspects of paradigmatically organized
morphology?


* To what extent can the morphological functions/constructions proposed in Sag 
(2012) be retained in their current form? To what extent can such constructional 
types, or their more traditional incarnations in standard HPSG, be made compati- 
ble with I-R models? 


Finally, the most difficult question of all is the oldest and the one with the widest sig- 
nificance: what kind of a thing is a dictionary entry? Is it a real, mentally represented 
linguistic construction or is it merely the convenient fiction of the lexicographer? We 
cannot address this question without providing very explicit answers to the representa- 
tional and ontological questions raised in this paper, and so I present my discussion of 
those questions as a modest contribution towards answering the much bigger question. 


Abbreviations 


ARG-ST  Argument Structure (attribute)
FS      feature structure
GPFM    Generalized Paradigm Function Morphology
HPSG    Head-driven Phrase Structure Grammar
I-R     inferential-realizational (model)
LI      Lexical/Lexemic Index
LID     Lexical Identifier
PFM     Paradigm Function Morphology
SBCG    Sign-Based Construction Grammar
SF      semantic function (role)
WFR     Word Formation Rule
Chapter 13


Troubles with flexemes 


Anna M. Thornton 
University of L’Aquila 


This paper investigates an aspect of the notion flexeme (French flexème), introduced by
Fradin & Kerleroux (2003), Fradin (2003). After a brief review of how this concept developed in
these authors’ work, and of how these authors conceive of lexemes (Section 2), the relation
between flexemes and overabundance (Thornton 2011, 2012) is explored. Overabundance is
introduced in Section 3, and Section 4 is devoted to some case studies, from Italian and other
languages. It is shown that a single lexeme can map to more than one flexeme, and that
overabundance results from this mapping. Moreover, it is shown that flexemes differing from each
other in parallel ways can have various relations with lexemes: in some cases, mapping to
different flexemes distinguishes two lexemes that are homophonous in their citation form (e.g.,
Italian SUCCEDERE¹ ‘happen’ with PST.PTCP successo and SUCCEDERE² ‘succeed’ with PST.PTCP
succeduto), while in other cases flexemes that differ from each other in a way parallel to the
previous one map to a single overabundant lexeme (e.g., Italian PERDERE ‘lose’ with PST.PTCP
perso and perduto). I conclude that the distinction between lexemes and flexemes first
proposed by Fradin & Kerleroux (2003) and Fradin (2003), as well as their definition of lexeme,
based on semantic and constructional coherence rather than on inflectional coherence, is
useful even beyond the area of lexeme formation for which it was originally proposed.


1 Introduction 


In a paper titled “Troubles with lexemes”, Bernard Fradin and Françoise Kerleroux (2003)
laid the bases for a critique of the commonly held notion of lexeme, drawing data from 
the realm of word-formation. They observed at the beginning of their paper: 


the lexeme is supposed to constitute one lexical unit. This unicity is guaranteed 
by inflection on the one hand and by the semantic content of the lexeme, which is 
supposed to be unique, on the other (Fradin & Kerleroux 2003: 177, emphasis mine). 


They proceeded then to show that the objects to which word-formation rules apply - 
which they propose to call lexemes, partially modifying the usual definition of this term 
- are semantically fully specified objects, that are, however, unspecified for inflection. In 
the concluding section of that paper, they propose to distinguish three different theoretical
entities: lexemes (“lexical individuals defined by the conjunction of three properties:
category, underspecification for inflection, full specification for meaning”, Fradin &
Kerleroux 2003: 193), syntactic words (which are inflected, categorized, and fully specified
for meaning), and a third entity, which they propose to call inflecteme in English and
flexème in French (see also Fradin 2003: 259). Objects of this third type are categorized,
uninflected and underspecified for meaning.

In this short contribution, I will discuss some aspects of these entities that have come 
to the fore of the debate in morphology after the publication of Fradin & Kerleroux (2003) 
and Fradin (2003). I prefer to refer to these units as flexèmes, because I think that the
intentional and witty phonological and orthographic overlap with lexème ‘lexeme’ is
too good to be lost, and as an homage to the authors who first proposed this term.
Following Fradin (forthcoming), in this paper I will use the English adaptation flexeme. 

The paper is organized as follows: Section 2 reviews the development of the concept 
of flexeme; Section 3 introduces the concept of overabundance in inflectional paradigms; 
Section 4 presents several case studies from Italian and other languages, illustrating cases 
in which a single lexeme is overabundant in one or more cells, i.e., maps to two distinct 
flexemes; Section 5 concludes. 


2 What are flexemes? 


In different contributions by Bernard Fradin, sometimes in collaboration with Françoise
Kerleroux, the concept of flexème/flexeme is presented differently: its coverage seems to
have grown with time, probably in consequence of our growing understanding of the 
workings of inflectional morphology in the early years of the third millennium. 

In Fradin & Kerleroux (2003: 193) the concept seems to be equivalent to that of stem 
(in the sense, e.g., of Aronoff 1994): 


This unit [i.e., the inflecteme/flexème] lacks semantic specification since it functions
as the “inflectional stem”.


However, the authors seem to have something more than just a single stem in mind, 
since immediately after this definition they observe: “This is correlated to the fact that 
“no semantic constraints hangs [sic] over the application of inflectional rules” (Corbin 
1987: 6)”. So the idea that flexemes have to do with instructions for building all the in- 
flected forms that realize a lexeme seems to have been present already in Fradin & Ker- 
leroux (2003). 

Fradin (2003: 259) states that 


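Les flexèmes [...] comportent [...] des informations relevant [...] du syntactique
interne (les différents thèmes flexionnels, sous forme hiérarchisée, s'il en existe
plusieurs [...]).
[Flexemes [...] include [...] information pertaining [...] to internal syntactics
(the various inflectional stems, in hierarchical form, if there are several [...]).]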
So the concept of flexeme seems to have developed from being used to refer to a stem
to being used to refer to the whole stem-set of a lexeme. In Fradin (forthcoming) a new
development appears.¹ The author, dealing with verbs, distinguishes between verbs as
morphological units, called “morphological verbs”, and verbs as lexical units, called “verbal
lexemes”. He states that “[m]orphologically, a V is defined by its inflectional paradigm”,
and maintains that the two French verbs RESSORTIR¹ ((de Y): il ressort, il ressortait…) ‘go
out again’ and RESSORTIR² ((à Y): il ressortit, il ressortissait…) ‘come under’ “constitute
distinct ‘flexemes’, see Fradin & Kerleroux (2003) [...] because the set of their word-forms
is not identical” (Fradin forthcoming: 4).

In this passage, Fradin attributes to Fradin & Kerleroux (2003) a fully developed con- 
cept of flexeme, in which a flexeme contains all the information needed to generate all 
the inflectional forms in a paradigm: not only the information about which stem to select, 
but also inflectional class and realization rules for the different inflected forms. Roughly, 
it seems to me, a flexeme now corresponds to the entities called form paradigm and 
realized paradigm in paradigm-linkage theory (Stump 2016). Fradin (forthcoming) also 
equates the notion of flexeme with that of Paradigm Identifier adopted by Bonami & 
Tribout (2012). In turn, Bonami & Tribout (2012) state that their notion of Paradigm
Identifier “[c]aptures Fradin & Kerleroux (2003)'s notion of a flexeme: a family of lexemes
with the same inflectional paradigm” (Bonami & Tribout 2012: slide 16).²

Papers such as Fradin (forthcoming) and Bonami & Tribout (2012) address the question
of how to deal with objects that are semantically different but morphologically identical,
such as CIRAGE¹ ‘polishing’ and CIRAGE² ‘shoe polish’, or PERLER¹ ‘sew beads on’ and
PERLER² ‘form beads on’, which share a flexeme (a form paradigm and a realized paradigm)
but are different lexemes.³

In this paper, on the contrary, I will explore the issue of objects that are the same 
lexeme, in the sense of Fradin & Kerleroux (2003) and Fradin (2003, forthcoming), but 
can be realized, to variable degrees, by different flexemes. 


3 Overabundance 


In recent years, attention has been drawn to the phenomenon of overabundance in inflec- 
tional paradigms (Thornton 2011, Stump 2016: 147-151). Overabundance is defined as the 
situation in which two or more forms are available to realize the same cell in an inflec- 
tional paradigm; in terms of paradigm linkage theory, one content cell has more than one 
realization. Stump (2016: 148) gives an example from English. Consider the verbs SEEM,
MEAN, and DREAM, and the realizations of their past tense: ⟨SEEM, {past}⟩ is realized by
seemed, ⟨MEAN, {past}⟩ is realized by meant, and ⟨DREAM, {past}⟩ can be realized either by
dreamed or by dreamt. The two (or more) forms that realize the same cell are sometimes
called cell mates (Thornton 2011). 
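The definition can be restated operationally: a cell is overabundant when more than one form is recorded for the same pairing of lexeme and property set. The sketch below uses the three English verbs just cited; the list-of-triples layout and the function name `cell_mates` are assumptions made for illustration.

```python
from collections import defaultdict

# (lexeme, morphosyntactic properties, realization), following the English example.
REALIZATIONS = [
    ("SEEM", "past", "seemed"),
    ("MEAN", "past", "meant"),
    ("DREAM", "past", "dreamed"),
    ("DREAM", "past", "dreamt"),
]


def cell_mates(rows):
    """Return only those cells that have more than one realization."""
    cells = defaultdict(list)
    for lexeme, properties, form in rows:
        cells[(lexeme, properties)].append(form)
    return {cell: forms for cell, forms in cells.items() if len(forms) > 1}


overabundant_cells = cell_mates(REALIZATIONS)
```

On this toy data only the past tense cell of DREAM is returned, with dreamed and dreamt as its cell mates.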


¹The notion of flexeme is not mentioned in Fradin & Kerleroux (2009).

²The notion of Paradigm Identifier is clearly articulated by Bonami & Crysmann (this volume).

³This phenomenon is labelled “homomorphy” by Stump (2016: 65): “homomorphic lexemes are lexically and
semantically distinct but alike in every detail of their morphology”. English examples are WEAR¹ ‘have on
(an article of clothing)’ and WEAR² ‘erode’.
How does overabundance relate to the notion of flexemes? Does the existence of distinct
but synonymous realizations for a given content cell force us to recognize distinct
flexemes linked to a single lexeme?

Fradin (forthcoming) analyzes cases such as PERLER¹ ‘sew beads on’ and PERLER² ‘form
beads on’ as distinct lexemes linked to the same flexeme. The case of dreamed ‘dream.PST’
and dreamt ‘dream.PST’ appears to be a mirror image of this case, with distinct flexemes
linked to a single lexeme. The existence of such a state of affairs would be predicted in
Fradin's theory, in which lexemes, defined as categorized and semantically fully specified
but uninflected objects, are autonomous from the flexemes that provide instructions
for the realization of their inflected forms. Recognizing the possibility that a single
lexeme may be linked to two (or more) flexemes implies that a difference in inflectional
realization cannot be invoked as one of the criteria that allow us to distinguish between
different lexemes vs. simply different senses/acceptations of a polysemous lexeme, as was
sometimes done in traditional discussions of the homonymy/polysemy distinction (see
e.g. Ullmann 1957: 127-132). Indeed, flexemes that are distinct in parallel ways may map
to a single lexeme or to distinct lexemes, where the criterion for recognizing distinct
lexemes is semantic and constructional difference, as proposed by Fradin & Kerleroux
(2003, 2009) and Fradin (2003, forthcoming).

In the following section, I will review some data that show that the mapping between 
flexemes and lexemes can be of several kinds. 
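The kinds of mapping at issue can be laid out as a relation between lexemes and flexemes and then sorted mechanically. The sketch below is illustrative only: the flexeme labels (`perler`, `dream-ed`, `dream-t`, `seem-ed`) are hypothetical identifiers invented here, and the function `classify` is an assumption, not part of any cited proposal.

```python
from collections import defaultdict

# (lexeme, flexeme) links; lexeme labels follow the text, flexeme labels are made up.
LINKS = [
    ("PERLER1", "perler"),   # 'sew beads on'
    ("PERLER2", "perler"),   # 'form beads on'
    ("DREAM", "dream-ed"),
    ("DREAM", "dream-t"),
    ("SEEM", "seem-ed"),
]


def classify(links):
    """Return (lexemes linked to several flexemes, flexemes shared by several lexemes)."""
    by_lexeme = defaultdict(set)
    by_flexeme = defaultdict(set)
    for lexeme, flexeme in links:
        by_lexeme[lexeme].add(flexeme)
        by_flexeme[flexeme].add(lexeme)
    overabundant = sorted(l for l, fs in by_lexeme.items() if len(fs) > 1)
    shared = sorted(f for f, ls in by_flexeme.items() if len(ls) > 1)
    return overabundant, shared


overabundant_lexemes, shared_flexemes = classify(LINKS)
```

A lexeme appearing in the first list is overabundant; a flexeme appearing in the second is shared by distinct lexemes; anything in neither list stands in a one-to-one mapping.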


4 Non-canonical mappings between lexemes and flexemes 


In this section, I will present data, mostly from well-studied cases in familiar languages, 
that show how one and the same difference in inflectional realization may map either to 
distinct lexemes or to a single overabundant lexeme. 


4.1 Case study 1: Noun plurals 


Nouns in which apparently more than one plural form pairs with a single singular form 
are very easy to find in language descriptions. Usually authors assume, at least implicitly, 
the admittedly vaguely defined criterion of ‘difference in meaning’ to decide whether 
specific cases represent distinct lexical items with homophonous singular forms or a 
single lexical item which is overabundant in its plural cell(s). Besides, since data are 
usually found in works which aim at description rather than at theoretical analysis, often 
authors leave the matter undecided, because it is not necessary for descriptive purposes 
to establish whether a certain case is an instance of homonymy or polysemy; on the 
other hand, cases in which no semantic distinction is observable between two or more 
different plural forms are usually highlighted by authors of descriptions. 


⁴Remember also the observation by Fradin & Kerleroux (2003: 177) quoted in Section 1, that unicity of a
lexeme “is guaranteed by inflection” as well as by the semantic content.
Cases such as the English and Breton ones in (1) and (2) are typical:


(1) English (Aronoff 2000: 347)

    a. SG brother   PL brothers
       ‘male sibling’
    b. SG brother   PL brethren
       ‘fellow member of a profession, society or sect’

(2) Breton (Trépos 1980)⁵

    a. SG eskob   PL eskibien
       ‘bishop’
    b. SG eskob   PL eskobou
       ‘kingpin’⁶

In these cases most authors argue that the meanings of the two items are sufficiently
distinct to allow us to consider them as distinct lexemes, which happen to be homophonous
in their singular form.⁷ In these cases, then, we have a 1:1 mapping between lexemes
and flexemes, with the extra quirk represented by the fact that two distinct flexemes
have homophonous singular forms.

However, by perusing the whole description of Breton noun plural offered by Trépos 
(1980), we discover that ‘bishop’ can have as many as three different plural forms (3a), 
and the same is true for ‘coat’ (3b): 


(3) Breton (Trépos 1980: § 149)

    a. SG eskob     PL eskibien/eskobed/eskeb
       ‘bishop’
    b. SG mantell   PL mentell/mentellou/mentilli
       ‘coat’


A similar situation is common in Modern Standard Arabic, where nouns often have 
several plural forms; authors of descriptions usually comment on when they would pre- 
fer to assign the different plural forms to distinct lexemes, on the basis of a clear distinc- 
tion in meaning, as in (4a vs. 4b, 4c vs. 4d), and when the different plural forms can be 
used interchangeably, and must be recognized as realizing the same lexeme, as in (5a-5b). 


⁵Breton nouns inflect only for number.
⁶The French gloss given by Trépos (1980: 73) for eskibien is ‘chevilles d’attelage’.
⁷Even if (1b) obviously derives from (1a) by means of a metaphorical extension.
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(4) Modern Standard Arabic (Holes 2004, Kaye 2007)? 


a. sG bayt PL buyu:t 
‘tent’, ‘house’ 
b. sc bayt PL ?abyait 
‘verse of poetry’ 
c. sG maktab PL maka:tib 
‘office’ 
d. sc maktab PL maktaba:t 
‘library’, bookshop’ 
(5) Modern Standard Arabic (Kaye 2007) 
a. sG fayn PL fafyun/fuyun 
eye 
b. sc sáriq PL sáriqun, saraqa, surraq 
‘thief’ 


With respect to nouns such as those in (5), Kaye (2007: 234-235) observes that “[t]here 
are many nouns with two or more plural variants without any difference in meaning”, 
while on the nouns in (4a-4b) he states that “[i]t is best to regard [...] bayt as distinct 
lexemes” (Kaye 2007: 234). 

Authors like Kaye rely on meaning distinction as the only criterion for distinguishing 
between lexemes, and (implicitly) accept the possibility that what they conceive of as 
single lexemes (like the ones in (5)) may have overabundant realizations in one or more 
cells, i.e., may map to more than one flexeme. Other authors, however, reject this possi- 
bility, and assume that a difference in inflectional realization (a difference in flexemes) 
must always correspond to a difference in lexemes. A champion of such a position is 
Paolo Acquaviva, who has articulated his point of view in his work on Italian double 
noun plurals (Acquaviva 2008). 

Italian nouns have inherent gender (with two values: feminine and masculine) and 
inflect for number (with two values: singular and plural). About 20 Italian nouns are 
usually described as overabundant in the plural (e.g., in traditional reference grammars 
such as Battaglia & Pernicone 1954). These nouns have a singular form in -o which is mas- 
culine, a plural form in -i which is masculine, and a plural form in -a which is feminine. 
Some representative examples are given in (6): 


(6) Italian (Acquaviva 2008, Thornton n.d.)
    a. SG braccio     PL braccia/bracci
       ‘arm’
    b. SG corno       PL corna/corni
       ‘horn’
    c. SG ginocchio   PL ginocchia/ginocchi
       ‘knee’
    d. SG membro      PL membra/membri
       ‘limb’/‘member’


⁸ MS Arabic nouns inflect for number (singular, dual, plural), case (nominative, genitive, accusative, with a
syncretism of genitive and accusative (sometimes called oblique) in non-singular forms), and definiteness
(definite, indefinite). In systems in which nouns inflect for other features besides number, if multiple forms
with the same number value exist they are predicted to exist in all cells; e.g., in Arabic, multiple plural
forms are predicted to exist in all case and definiteness values.


Acquaviva’s position is that plurals in -a, independently of whether they differ in 
meaning from the plurals in -i with which they share a root, are distinct lexemes, pluralia 
tantum, derivationally related to the lexemes in -o/-i with which they share a root: 


plurals in -a [...] are lexical plurals: distinct, inherently plural nouns, related to 
the base noun by a word-formation process. (Acquaviva 2008: 123, emphasis mine) 


Braccia ‘arms’ is not the plural of braccio ‘arm’; it is an inherently plural lexeme, 
derived from the same root as braccio/bracci (Acquaviva 2008: 157, emphasis mine) 


He brings forward several arguments for his position, which are reviewed in Thornton 
(n.d.: 430-438), where it is shown that one of them (agreement with conjoined singular 
NPs) is based on a misunderstanding of the workings of Italian agreement resolution 
rules, and can be dismissed as irrelevant. His other arguments will be illustrated here. 

The first argument is purely metatheoretical. Acquaviva states it as follows: 


The simple fact that a number of plurals in -a do not block their regular alternants 
in -i is enough to prove the point, if we take seriously inflectional disjunctivity 
(Acquaviva 2008: 145, emphasis mine). 


This argument boils down to positing as a theoretical requirement the non-existence of
overabundance, or the impossibility for a single lexeme to map to distinct flexemes. Such
a choice eliminates the problem we are investigating by denying its existence, rather
than by offering a solution. However, if we assume, as is done in the canonical approach
to morphological typology (Corbett 2005, 2006, 2007), that inflectional disjunctivity and
lack of overabundance are only canonical properties of lexemes, rather than inviolable
theoretical requirements, the problem reappears and needs to be investigated.

Another argument put forward by Acquaviva to establish that plurals in -a are distinct 
lexemes from their co-radicals in -o/-i is consonant with Fradin & Kerleroux's (2003) view
of lexemes: Acquaviva observes that some plurals in -a appear to be the bases of word-formation
processes. An example would be cornificare ‘to make a cuckold of’, which Acquaviva
analyzes as derived from corna ‘horns’ (6b); cornificare is synonymous with the
idiom fare/mettere le corna ‘to make a cuckold of, lit. to make/put horns.F.PL’, which is
never realized by *fare/mettere i corni, with ‘horns.M.PL’. On this basis, one can presume
that corna, and not corni, is the base of cornificare. However, the idiom fare/mettere un
corno ‘to make a cuckold of, lit. to make/put a horn.M.SG’ is also attested, so one cannot
exclude that the base of cornificare is a non-defective lexeme corno/corna, rather than
a plurale tantum defective noun corna. In any case, this argument boils down to recognizing
different lexemes when there is a difference in semantics and in the possibility
of appearing in certain constructions, as proposed also by Fradin & Kerleroux (2003, 
2009), Fradin (2003). This is orthogonal to the question whether a lexeme, defined on 
the basis of its semantics and distribution in constructions, can be overabundant in one 
or more cells. If we show that two plural forms appear in the same set of environments 
and constructions, they must be recognized as belonging to the same lexeme (unless, 
like Acquaviva, one wants to posit a difference of inflectional realization as sufficient 
for recognizing distinct lexemes, regardless of the equal semantics and distribution of 
the forms). Thornton (2010-2011) has shown, by means of corpus-based evidence, that in 
some cases two plurals in -i and -a are used interchangeably in the same context, and 
cannot therefore be considered as instances of distinct lexemes in Fradin & Kerleroux’s
(2003) sense. This is the case for ginocchi/ginocchia ‘knees’ (6c), both of which appear
interchangeably (as well as the singular form ginocchio) in a number of syntactic environments
(Thornton n.d.: 465). In the case of membra and membri (6d), instead, there is
evidence to posit two distinct lexemes, MEMBRO¹ ‘limb (body part)’, which is [-human],
and MEMBRO² ‘member (of a committee, organization, etc.)’, which is [+human], and is
obviously derived from MEMBRO¹ by means of a metaphoric extension. MEMBRO² is not
overabundant: its plural is always membri, and it is the base of a derived feminine MEMBRA
‘female member (of a committee, organization, etc.)’, PL membre (Thornton 2014).
MEMBRO¹ is not overabundant either: its plural is membra ‘limbs’; however, contrary to
Acquaviva's analysis, it is not defective: the singular membro in the sense of ‘limb, body
part’ is attested (cf. Thornton n.d.: 463, fn. 38). These examples show that each case in
which we observe, in Italian, a feminine plural in -a and a masculine one in -i based on 
the same root, must be analyzed in its own right: the parallelism in the flexemes does not 
guarantee a parallelism in the lexemes. Membri and membra belong to different lexemes 
(defined according to Fradin & Kerleroux's (2003) and Fradin's (2003) semantic criteria), 
while ginocchi and ginocchia belong to the same lexeme - if we admit the possibility 
of overabundance, i.e. of a single lexeme mapping to more than one flexeme. The case 
of bracci and braccia is particularly complex: these very frequent forms, if submitted to 
Fradin & Kerleroux's (2003) and Fradin's (2003) criteria for the recognition of distinct 
lexemes, map to several semantically distinct lexemes, some of which are overabundant 
in the plural (e.g., ‘arm (body part)’), while others select only one plural form (e.g., ‘ell
(measure of length)’ selects braccia). Again, the mapping between lexemes and flexemes
is not 1:1, as shown in Figure 1. 
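
The mapping just described for braccio can be written down as a small data structure. What follows is only a minimal sketch: the labels BRACCIO1 and BRACCIO2, the encoding of a flexeme as a "singular/plural" string, and the function names are illustrative assumptions, not notation used in this chapter.

```python
from collections import defaultdict

# Lexeme -> set of flexemes it maps to, following the discussion of (6a).
# BRACCIO1 = 'arm (body part)', BRACCIO2 = 'ell (measure of length)'.
LEXEME_TO_FLEXEMES = {
    "BRACCIO1": {"braccio/bracci", "braccio/braccia"},
    "BRACCIO2": {"braccio/braccia"},
}


def overabundant_lexemes(mapping):
    """Lexemes that map to more than one flexeme."""
    return sorted(lexeme for lexeme, flexemes in mapping.items() if len(flexemes) > 1)


def shared_flexemes(mapping):
    """Flexemes that realize more than one lexeme, with the lexemes concerned."""
    lexemes_by_flexeme = defaultdict(set)
    for lexeme, flexemes in mapping.items():
        for flexeme in flexemes:
            lexemes_by_flexeme[flexeme].add(lexeme)
    return {f: sorted(ls) for f, ls in lexemes_by_flexeme.items() if len(ls) > 1}


def is_one_to_one(mapping):
    """True only if every lexeme has one flexeme and no flexeme is shared."""
    return not overabundant_lexemes(mapping) and not shared_flexemes(mapping)


print(overabundant_lexemes(LEXEME_TO_FLEXEMES))  # ['BRACCIO1']
print(shared_flexemes(LEXEME_TO_FLEXEMES))       # {'braccio/braccia': ['BRACCIO1', 'BRACCIO2']}
print(is_one_to_one(LEXEME_TO_FLEXEMES))         # False
```

Under this encoding the departure from a 1:1 mapping shows up in both directions: one lexeme has two flexemes, and one flexeme serves two lexemes.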


Lexeme BRACCIO¹ ‘arm (body part)’          →  Flexeme braccio/bracci, Flexeme braccio/braccia
Lexeme BRACCIO² ‘ell (measure of length)’  →  Flexeme braccio/braccia


Figure 1: Mapping between two lexemes and two flexemes in Italian


4.2 Case study 2: Past participles


Another area in which mapping between semantically defined lexemes and flexemes 
is not always 1:1, and in which differences in flexemes do not invariably coincide with 
differences in lexemes, is verbal inflection. 

In some cases, two semantically and constructionally distinct lexemes have quite different
realized paradigms, even if the citation forms coincide. A case in point is that of
Italian SUCCEDERE¹ ‘happen’ and SUCCEDERE² ‘succeed’. SUCCEDERE¹ ‘happen’ is an impersonal
verb, which is used only in 3rd person forms; its PST.PTCP is successo. SUCCEDERE²
‘succeed’ is a bi-argumental verb; its second argument is introduced by the preposition
a ‘to’; it has a full set of realized forms, and its PST.PTCP is overabundant, according to
various authoritative sources (Zingarelli 2016, Serianni 1988): it can be either succeduto
or successo. The forms are shown in (7):


(7) Italian (Zingarelli 2016, Serianni 1988, personal knowledge)
    lexeme                  PST.PTCP
    SUCCEDERE¹ ‘happen’     successo
    SUCCEDERE² ‘succeed’    successo/succeduto


From (7) it would appear that SUCCEDERE¹ maps to a single flexeme, while SUCCEDERE²
maps to two. However, for SUCCEDERE² the form succeduto is prescribed over successo
by normatively oriented sources like Serianni (1988: § 316), and the most recent example
of successo as a form of SUCCEDERE² cited by Serianni (1988) is from a novel published in
1960. Investigation of contemporary usage in corpora is difficult for practical reasons:
successo has 87763 tokens in the corpus la Repubblica 1985-2000 (380M tokens; I will
consider data from this corpus as representative of contemporary Italian usage of successo
and succeduto), making it impractical to examine each token to assign it to either
SUCCEDERE¹ or SUCCEDERE². Besides, successo is a homonym of the SG form of the noun
SUCCESSO ‘success’. However, manual examination of the first 200 random tokens of the
string successo a, which corresponds to both ‘happened to’ and ‘succeeded to’, suggests
that in this context successo always realizes SUCCEDERE¹ ‘happen’, while, as expected, all
the 374 tokens of succeduto in the corpus la Repubblica 1985-2000 realize SUCCEDERE² ‘succeed’.
So, as far as the PST.PTCP is concerned, it appears that in contemporary Italian the
two lexemes SUCCEDERE¹ ‘happen’ and SUCCEDERE² ‘succeed’ map to different flexemes.⁹

We can compare this situation with that of the verb PERDERE ‘lose’, which is genuinely 
overabundant in its PST.PTCP, as shown in (8): 


⁹ Things are more complicated with the simple past, which is (exemplifying with 3SG forms) successe for
SUCCEDERE¹ and overabundant for SUCCEDERE² (successe/succedette; a third form, succedé, is theoretically
possible as ‘succeed.PST.3SG’, but it is not attested in the corpus la Repubblica 1985-2000). Successe has 1263
tokens and succedette 43 tokens in this corpus; all tokens of succedette realize SUCCEDERE² ‘succeed’; manual
examination of the 14 tokens of the string successe a ‘happened to/succeeded to’ reveals that in most cases
it realizes SUCCEDERE¹ ‘happen’, but in 2 cases successe realizes SUCCEDERE² ‘succeed’, confirming that this
verb is overabundant in its simple past. However, the simple past does not belong to the native grammar
of many speakers of Italian, for whom it is a learned form; so it is unwise to draw strong conclusions from
these data. Overabundance in the simple past in Italian shall be left for further research.

(8) Italian (personal knowledge)
    lexeme            PST.PTCP
    PERDERE ‘lose’    perso/perduto


Speakers appear unaware of conditions regulating the selection of either one of the two
forms, to the point that many speakers asked the Accademia della Crusca's linguistic
consulting service for advice on when to use each form (Thornton 2016). Speakers seem
convinced that rules that govern a complementary distribution of the two forms should
exist, but in fact the distribution of the two PST.PTCP forms is not complementary: they
can be used interchangeably in many contexts, including idioms, as shown in (9a-b) and
already shown by Thornton (2011: 369); the only case in which only one form is used is
in titles of works of art (9c-e). Representative data, with frequencies from the corpus la
Repubblica 1985-2000 when relevant, are presented in (9).


(9) Italian (Thornton 2011: 369, Thornton 2016, personal knowledge)
    a. occasione perduta 291 / occasione persa 83
       ‘a chance lost’
    b. perso la guerra 109 / perduto la guerra 32
       ‘lost the war’
    c. I predatori dell'arca perduta/*persa
       ‘Raiders of the lost ark’
    d. Alla ricerca del tempo perduto/*perso
       À la recherche du temps perdu by Proust, literally ‘In search of lost time’;
       English translation’s title ‘Remembrance of things past’
    e. Paradiso perduto/*perso
       ‘Paradise lost’


This case study shows again a case in which similar differences in flexemes do not map
in a parallel way to differences in lexemes: while SUCCEDERE¹ ‘happen’ and SUCCEDERE²
‘succeed’ map to distinct flexemes, in which the PST.PTCP forms are successo and succeduto
respectively, PERDERE ‘lose’ maps to two flexemes, distinct from each other in a way
parallel to the flexemes SUCCEDERE¹ and SUCCEDERE², and its PST.PTCP can be realized by
both perso and perduto.
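
The token counts in (9a-b) can be turned into frequency ratios between the two cell mates, in the spirit of the ratios used later in this chapter for the verbs in (12a). The sketch below is only illustrative: the function name and the way contexts are labelled are assumptions, and the ratio is a plain quotient of the two counts rather than any particular published measure.

```python
# Token counts from (9a-b), corpus la Repubblica 1985-2000.
COUNTS = {
    "occasione ___": {"perduta": 291, "persa": 83},
    "___ la guerra": {"perso": 109, "perduto": 32},
}


def cell_mate_ratio(counts):
    """Return (more frequent form, less frequent form, ratio) for two cell mates."""
    (major, n_major), (minor, n_minor) = sorted(
        counts.items(), key=lambda item: item[1], reverse=True
    )
    return major, minor, n_major / n_minor


for context, counts in COUNTS.items():
    major, minor, ratio = cell_mate_ratio(counts)
    print(f"{context}: {major} vs. {minor} = {ratio:.1f}:1")
# occasione ___: perduta vs. persa = 3.5:1
# ___ la guerra: perso vs. perduto = 3.4:1
```

On these figures the two ratios are of similar size, but the more frequent form is the -uto participle in (9a) and the -so participle in (9b), which is consistent with the claim that the distribution is not complementary.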


4.3 Systematic overabundance and overabundance in all cells 


The two case studies illustrated above have shown examples in which there is an overabundant
cell in the form paradigm and the realized paradigm of certain lexemes (such
as Italian BRACCIO¹ ‘arm’ and PERDERE ‘lose’). Technically, this should be enough to recognize
that such lexemes map to distinct flexemes. However, if one wished to take into
account quantitative considerations, one might want to deal with these cases by recognizing
a minor “exception”, and still posit a single flexeme with a single exceptional,
overabundant cell.

However, overabundance is not always confined to a single cell. In this section I will 
illustrate cases of “systematic overabundance” (Bonami & Stump 2016: 469), in which entire
slabs or subparadigms are involved,¹⁰ and cases of overabundance in all cells. These
cases definitely deserve consideration in the context of exploring the possible deviations 
from a 1:1 mapping between lexemes and flexemes. 

A particularly clear example of systematic overabundance is found in Spanish, where 
all verbs have two complete sets of forms, built by means of different endings, in the 
Imperfect Subjunctive, as shown in Table 1 for the verb haber ‘have’. 


Table 1: Imperfect Subjunctive of Spanish haber ‘have’


           -ra set       -se set
1SG/3SG    hubiera       hubiese
2SG        hubieras      hubieses
1PL        hubiéramos    hubiésemos
2PL        hubierais     hubieseis
3PL        hubieran      hubiesen


Despite a suggestion by Bolinger (1956) that there is some subtle semantic difference 
between the two sets of forms, contemporary descriptions agree that “these two sets of 
forms are interchangeable” (Butt & Benjamin (2000: 167); see also Rojo & Veiga (1999: 
2910): “las formas en -ra y -se son hoy por hoy perfectamente equivalentes”). Spanish 
verbal lexemes, then, appear to systematically map to two flexemes, which are distinct 
in the Imperfect Subjunctive forms - unless one wants to build overabundance into
the definition of Spanish verbal flexemes, precisely because of its systematicity.

In other cases, however, we encounter overabundance in all cells of a given lexeme, but 
this is not systematic across all the lexemes within that part of speech in the language; 
therefore, the possibility of building overabundance into the definition of the flexemes to
which these lexemes map is not viable, and we must recognize a 1:2 mapping between 
lexemes and flexemes. 

A case in point is that of the Italian noun ORECCHIO ‘ear’. This noun can be described
as overabundant in all its cells: it has two SG forms and two PL forms, as shown in (10):


¹⁰ The notion of slab was introduced by Carstairs (1987: 81), who defines it as “a subset of the macroinflexions
within one paradigm consisting of all the macroinflexions which are associated with some specified
morphosyntactic property”. His examples from Latin noun paradigms are the singular slab (all singular
case-forms) or the genitive slab (GEN.SG and GEN.PL). The notion of sub-paradigm is used in a variety of
senses, most commonly by scholars with a background in Slavonic languages. It aims at capturing subsets
of cells in a paradigm which share more than just one feature value, such as verb tenses (the Present
Indicative, the Present Subjunctive, etc.).

(10) Italian (personal knowledge)
     lexeme      ORECCHIO
     SG forms    orecchio (M)    orecchia (F)
     PL forms    orecchi (M)     orecchie (F)


Of course, one could posit two distinct lexemes, ORECCHIO(M) and ORECCHIA(F), on the 
basis of the difference in gender, which is canonically an inherent fixed feature value in 
nouns. However, we already know from the cases discussed in Section 4.1 that Italian 
has nouns which change their gender value from the singular to the plural. Besides, ac- 
cording to Fradin's (2003) and Fradin & Kerleroux's (2003) definition of lexeme, which 
recognizes a single lexeme on the basis of identity of meaning and constructional distri- 
bution, the different forms in (10) appear to belong to the same lexeme, since they can 
be used interchangeably in the same contexts, even in idioms (11a-11b), as shown by the 
examples in (11): 


(11) Italian (personal knowledge; frequency data from the corpus la Repubblica
     1985-2000)
     a. fare orecchi da mercante 18 / orecchie da mercante 139
        ‘to turn a deaf ear’, lit. ‘to do merchant's ears’
     b. dare una tirata d'orecchi 122 / tirata d'orecchie 92
        ‘to give a dressing-down’, lit. ‘to give a tug of ears’
     c. occhi e orecchi 19 / occhi e orecchie 68
        ‘eyes and ears’
     d. da un'orecchia all'altra 2 / da un'orecchio all'altro 13
        ‘from one ear to the other’


So Italian ORECCHIO can be analyzed as a single lexeme mapping to two flexemes, as 
shown in Figure 2. 


Lexeme ORECCHIO ‘ear’  →  Flexeme orecchio / orecchi (M)
Lexeme ORECCHIO ‘ear’  →  Flexeme orecchia / orecchie (F)


Figure 2: Mapping between one lexeme and two flexemes in Italian. 


The flexemes are distinct; they instantiate nouns of different inflectional classes, while 
most Italian noun lexemes map to only one flexeme, belonging consistently to only one 
gender and one inflectional class, as shown by the examples in Table 2. 

Lexemes such as BRACCIO¹, GINOCCHIO and ORECCHIO are non-canonical, in that they
map to more than one flexeme, as seen above.

Table 2: Italian (personal knowledge).


lexeme        flexeme             gloss
              SG        PL
OCCHIO (M)    occhio    occhi     ‘eye’
BOCCA (F)     bocca     bocche    ‘mouth’
MANO (F)      mano      mani      ‘hand’


The last case of non-canonical mapping between lexemes and flexemes that I will
examine is that of certain Italian verbs that are described as able to inflect according to
two different conjugations; these are called “verbi sovrabbondanti” by Serianni (1988).

Grammars usually address together two kinds of such verbs: those in which the difference
in conjugation does not bring along a difference in meaning (12a), and those in
which the difference in inflectional class goes hand in hand with a difference in meaning
(12b).


(12) Italian (Serianni 1988, personal knowledge)
     a. i.   adempiere/adempire
             ‘fulfil’
        ii.  compiere/compire
             ‘complete’
        iii. empiere/empire
             ‘fill’
        iv.  riempiere/riempire
             ‘fill’
     b. i.   abbonare/abbonire
             ‘subscribe’/‘appease’
        ii.  arrossare/arrossire
             ‘make red’, ‘dye red’/‘redden’, ‘flush’
        iii. fallare/fallire
             ‘make a mistake’/‘fail’
        iv.  imboscare/imboschire
             ‘hide [in a wood]’/‘afforest’
        v.   impazzare/impazzire
             ‘be in full swing’/‘go crazy’
        vi.  sfiorare/sfiorire
             ‘brush’, ‘graze’/‘wither’, ‘wilt’


Serianni (1988), from which the examples in (12) are taken, considers the cases in (12a)
and (12b) as two groups of overabundant verbs, while Dardano & Trifone (1985) consider 
only cases (12a) as overabundant verbs, and propose that cases in (12b) are best analyzed 
as distinct lexemes; I concur with Dardano & Trifone, because of a clear difference in 
meaning between the two verbs in each pair in (12b); these verbs are different lexemes 
according to Fradin’s (2003) and Fradin & Kerleroux’s (2003) criteria, and will not be 
further discussed here. 

Verbs in (12a) are claimed to have forms belonging to the two inflectional classes traditionally
called 2nd conjugation (infinitive ending in -ere) and 3rd conjugation (infinitive
ending in -ire); besides, the 3rd conjugation forms belong to the subclass of 3rd conjugation
verbs which does not exhibit the element -isc- in the appropriate morphomic
partition (so PRS.IND.1SG is empio, not *empisco, etc.). The 2nd conjugation and the -isc-less
subclass of the 3rd conjugation have non-distinct inflection in several cells, listed in
(13a), while they have distinct forms in other cells, listed in (13b), with examples from
riempiere and riempire:


(13) Italian (personal knowledge)
     a. Cells with non-distinct realization for the verbs in (12a)
        Present Indicative: all person/number forms, except 2PL
        Present Subjunctive: all person/number forms
        Imperative 2SG
        Gerund
        (Present Participle)¹²
     b. Cells with distinct realization for the verbs in (12a)
        Present Indicative 2PL = Imperative 2PL (e.g., riempiete vs. riempite)
        Imperfective Past Indicative (Imperfetto): all person/number forms (e.g., 1SG
        riempievo vs. riempivo, etc.)
        Simple Perfective Past Indicative (Passato Remoto): all person/number forms
        (e.g., 1SG riempietti or riempiei vs. riempii, etc.)
        Future: all person/number forms (e.g., 1SG riempierò vs. riempirò, etc.)
        Imperfect Subjunctive: all person/number forms (e.g., 1SG riempiessi vs.
        riempissi, etc.)
        Present Conditional: all person/number forms (e.g., 1SG riempierei vs.
        riempirei, etc.)
        Past Participle (e.g., riempiuto vs. riempito)
        Infinitive (e.g., riempiere vs. riempire)


¹¹ In (13) I consider only synthetic forms; periphrastic forms are formed by an inflected auxiliary followed by
a Past Participle, so their distinctness is a function of the distinctness of the Past Participle form (therefore,
they are always distinct for these two conjugations).

¹² A so-called Present Participle ending in -nte is normally listed as part of a verb's paradigm in Italian descriptive
grammars, but it is extremely doubtful that such a cell should be recognized as a genuine part of
verbal paradigms in Italian. Haspelmath (1996) contrasts these so-called present participles of Italian with
those of other languages in terms of their syntactic properties (government of subject and non-subject
arguments) and concludes that in Italian "active participles do not exist" (Haspelmath 1996: 61). Luraghi
(1999) is less drastic, but shows that -nte forms have never been part of the spoken register in the history
of the language, and that a verbal usage of -nte forms is only attested in some technical or bureaucratic
registers, while adjectives and nouns in -nte, often unrelated to any verbal base, are common.


The verbs in (12a) are technically cases of single lexemes mapping to two distinct 
flexemes, but these flexemes are syncretic in all the cells listed in (13a). 
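
The split between syncretic and distinct cells can be checked mechanically once the two flexemes are listed cell by cell. The following is a minimal sketch over a handful of cells only. The cell labels and the function name are illustrative assumptions; the forms riempio and riempiendo are supplied here as instances of the syncretic cells in (13a) and are not quoted from the text; and the Passato Remoto is left out because the 2nd conjugation itself has two forms there.

```python
# A few cells of the two flexemes of 'fill', 2nd vs. 3rd conjugation.
RIEMPIERE = {
    "PRS.IND.1SG": "riempio",      # illustrative, not quoted in (13)
    "GER": "riempiendo",           # illustrative, not quoted in (13)
    "PRS.IND.2PL": "riempiete",
    "IPFV.IND.1SG": "riempievo",
    "FUT.1SG": "riempierò",
    "IPFV.SBJV.1SG": "riempiessi",
    "COND.1SG": "riempierei",
    "PST.PTCP": "riempiuto",
    "INF": "riempiere",
}
RIEMPIRE = {
    "PRS.IND.1SG": "riempio",      # illustrative, not quoted in (13)
    "GER": "riempiendo",           # illustrative, not quoted in (13)
    "PRS.IND.2PL": "riempite",
    "IPFV.IND.1SG": "riempivo",
    "FUT.1SG": "riempirò",
    "IPFV.SBJV.1SG": "riempissi",
    "COND.1SG": "riempirei",
    "PST.PTCP": "riempito",
    "INF": "riempire",
}


def split_cells(flexeme_a, flexeme_b):
    """Partition the cells of two flexemes into syncretic and distinct ones."""
    syncretic, distinct = [], []
    for cell in flexeme_a:
        same = flexeme_a[cell] == flexeme_b[cell]
        (syncretic if same else distinct).append(cell)
    return syncretic, distinct


syncretic, distinct = split_cells(RIEMPIERE, RIEMPIRE)
print(syncretic)  # ['PRS.IND.1SG', 'GER']
print(distinct)   # the remaining seven cells, in the order listed above
```

Only the cells in the second list can provide corpus evidence about which of the two flexemes a given token belongs to.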

As I am always wary of believing statements by grammars on the distribution of cell 
mates, I have checked the distribution in the corpus la Repubblica 1985-2000 of the forms 
of the verbs in (12a) that are distinct in the two conjugations. Table 3 illustrates the results
(figures for forms of the same Tense/Mood have been added together). 

The data in Table 3 show the following picture: EMPIERE/EMPIRE ‘fill’ are almost ex- 
tinct verbs in both conjugations, totaling only 13 forms overall; their meaning is nor- 
mally expressed, in contemporary Italian, by RIEMPIRE; RIEMPIERE ‘fill’ is little used — 
there are a few tokens of the Infinitive and of the Imperfective Past Indicative (Imper- 
fetto) in usage, but the ratio between forms of RIEMPIRE and forms of RIEMPIERE in the 
cells for which the two conjugations have distinct forms is so unbalanced (504:1) that 
the two verbs represent at best an extremely weak and non-canonical case of overabun- 
dance (or mapping from one lexeme to two flexemes) according to Thornton's (2012: 
188-189) criteria for measuring the strength of overabundance on the basis of frequency 
ratios between two cell mates. ADEMPIERE and ADEMPIRE ‘fulfil’ have a less unbalanced 
frequency ratio (15.2:1) overall, but it must be observed that 99.5% of the forms of ADEM- 
PIERE are realizations of the Infinitive and the Past Participle, while 93.3% of the forms 
of ADEMPIRE are realizations of tenses different from the Infinitive and the Past Participle.
Indeed, all the Past Participle forms are 2nd conjugation forms (i.e., they are forms
of ADEMPIERE, not possible forms of ADEMPIRE), so there is no overabundance in this 
cell; the only tenses in which the two verbs display some overabundance are the Fu- 
ture (with a ratio of 5.4:1 in favour of ADEMPIRE) and, very marginally, the Infinitive 
(with a very unbalanced ratio of 154:1 in favour of ADEMPIERE). The same picture, even 
more dramatically, is presented by COMPIERE/COMPIRE ‘complete’. Assessment of overabundance
in this case is made difficult by the fact that some Past Participle forms of
COMPIRE are homographous with other forms in the paradigm, and/or with forms of the
noun COMPITO ‘task, homework’, and/or of the adjective COMPITO ‘courteous, polite’, and/or
of the verb COMPITARE ‘spell out’ (e.g., compito represents ‘complete.PST.PTCP.M.SG’,
‘task(M).SG’, ‘courteous.M.SG’ and ‘spell out.PRS.IND.1SG’; the noun for ‘task’ and the 1SG
form of ‘spell out’ have antepenultimate stress, while the other forms have penultimate
stress, but stresses on these syllables are not marked in the standard orthography of
Italian, so all the forms are homographs even if they are not all homophonous); these
homographies have been manually disambiguated for the forms ending in -a and -e (compita
‘complete.PST.PTCP.F.SG’, ‘courteous.F.SG’, ‘spell out.PRS.IND.3SG’ and compite
‘complete.PST.PTCP.F.PL’, ‘complete.PRS.IND.2PL’, ‘courteous.F.PL’), which have low frequency,
thus making manual disambiguation practical; the lack of manual disambiguation for
the high frequency forms in -o and -i explains why the exact frequency of these forms is
not given in Table 3, and a question mark has been inserted instead.¹³ The actual forms
realizing ‘complete.PST.PTCP.F.SG’ and ‘complete.PST.PTCP.F.PL’ turned out to be a minority
(3 out of 48 (6%) for the F.SG form, 1 out of 9 (11%) for the F.PL form). Therefore, it may be
concluded with some safety that in the Past Participle cell the verb COMPIERE is favoured 
and COMPIRE is quite underrepresented. These two verbs show the same kind of “divi- 
sion of labour” already observed for ADEMPIERE and ADEMPIRE: COMPIERE specializes for 
the Infinitive and the Past Participle, and COMPIRE for all other tenses (among the ones
that have distinct realizations for the two conjugations); however, in most tenses a few 
forms of COMPIERE are also attested, so COMPIERE/COMPIRE represent the best example of 
overabundance in all cells encountered so far among the Italian verbs commonly dubbed 
“sovrabbondanti” (although the frequency ratios render this case of overabundance not 
very canonical). It seems that ADEMPIERE/ADEMPIRE and COMPIERE/COMPIRE are on their 
way from overabundance to heteroclisis: at some point in the future, we might observe 
a lexeme with finite synthetic forms belonging to the 3rd conjugation and Infinitive and
Past Participle (which carries with it all the periphrastic forms) belonging to the 2nd conjugation.
RIEMPIERE/RIEMPIRE, instead, is just reducing overabundance in favour of the
3rd conjugation forms, and is quite advanced in this process. 

If the process leading to heteroclisis is completed, we will have a single lexeme map- 
ping to a single heteroclitic flexeme. At the moment, however, we have a number of 
Italian verbal lexemes that map to two flexemes, at least in parts of their paradigm. 


5 Conclusions 


The data illustrated in this paper show that the distinction between lexemes and flexemes 
first proposed by Fradin & Kerleroux (2003) and Fradin (2003), as well as their definition 
of lexeme based on semantic and constructional coherence, is useful even beyond the 
area of lexeme formation, for which it was originally proposed. A separation between 
lexemes and flexemes, like the separation between content paradigms, form paradigms 
and realized paradigms adopted in paradigm-linkage theory, is a useful tool in models 
of morphological analysis that recognize a level of autonomous morphology.


In this chapter, we shed new light on the reduplicative processes of Mandarin Chinese and
assess the structural and interpretive properties of the input/base and output of these word 
formation phenomena. In particular, we focus on the categorial status of the base and ad- 
dress the issue of whether reduplication applies to category-free roots or full-fledged lex- 
emes. Empirically, the privileged domain of research is increasing reduplication of disyllabic 
bases, or, as we dub it in the chapter, the AABB pattern, which is compared with diminishing 
reduplication, expressed by the template ABAB. The comparison between the two phenom- 
ena allows us to show that increasing and diminishing reduplication differ in the nature of 
the input units involved. On the grounds of a wide-ranging class of data, we argue that Man- 
darin reduplication takes base units of different ‘size’: word/lexeme-like units provided with 
category, namely verbs in the case of diminishing reduplication, and categoryless roots in 
the case of increasing reduplication. Throughout the chapter, we explore some category-neutral
properties of increasing reduplication and propose a unitary semantic operation capable
of deriving the various interpretive nuances of this phenomenon across lexical categories.


1 Introduction 


1.1 Lexemes vs. words and reduplication phenomena 


Lexemes are usually understood as sound/meaning pairs, i.e. linguistic signs provided 
with lexical category specification yet lacking inherent inflectional specification. Lex- 
emes and words are thus considered as distinct entities in lexicalist approaches to word 
formation. As a matter of fact, while a word proper is a fully inflected entity functioning 
as a syntactic atom, a lexeme is the abstract version of the word-form lacking inflectional 
marking (Fradin & Kerleroux 2003). As put forward by Fradin & Kerleroux (2003), the 
form of the lexeme can either be segmentally simple (viz. a root) or complex (viz. a stem),
with affixal derivation, compounding and reduplication as phenomena possibly involved 
in lexeme formation. 

Reduplication phenomena, however, are particularly challenging under this approach, 
since cross-linguistically the functions of reduplication are very varied and difficult to 
place categorically within the derivational domain of lexemes. In fact, whereas derivation 
typically forms new lexemes and can be category changing, reduplication often conveys 
values typically found in the inflectional domain. Although reduplication is attested with 
a variety of meanings (and forms) across languages, this phenomenon is consistently as- 
sociated with its prototypical (iconic) function of intensification. In its increasing value, 
reduplication in the nominal domain gives as a result plural nouns, and in the domain 
of verbs it usually conveys aspectual meanings, i.e. pluractionality, iterative or progres- 
sive aspect, which are features prototypically expressed by inflection markings in most 
Indo-European languages. With adjectives, the prototypical value is intensification of 
the property/quality expressed by the base adjective. Nevertheless, independently of its 
semantic values, reduplication manifests several properties of word/lexeme formation 
and, formally, approaches derivational phenomena. First of all, (full) reduplication con- 
sists in the iteration of simple or complex roots (viz. stems), since it may also involve 
complex objects, such as compounds. Crucially, however, it typically applies to unin- 
flected bases, with inflectional marking, if any, applying outside of/after reduplication. 
Moreover, reduplication shows many properties of compounding, since it often induces a 
reanalysis of the stress or tonal pattern of its base, or the insertion of epenthetic material 
between the two iterating units and/or some other phonological readjustment. Further, 
semantic drift and idiosyncrasy can characterize the outputs of reduplicative processes, 
while inflection phenomena are very transparent at the interpretive level (see Forza 2011
for an enlightening typological perspective).

Therefore, under the lexeme/word distinction approach, we could argue that redupli- 
cation applies to roots or stems (traditionally understood as the phonological form of 
lexemes) and its domain of application is below the level of the word, or below X° in the 
standard X-bar approach. 


1.2 Words, lexemes, and roots/stems in Mandarin Chinese 


If the concept of lexeme appears empirically motivated in fusional or agglutinating languages,
in which inflection markers modify the word form to convey the features relevant in a given
syntactic context, it is less well grounded in isolating languages, where (concrete) words
occur with no or very few inflection markers, typically show an invariable form and are
virtually indistinguishable from the corresponding (abstract) lexemes. Mandarin Chinese is
one such language: words have little or no inflection, and lexemes, as abstract
representations of words, cannot be distinguished from word forms on a formal basis.

In Mandarin, the crucial distinction at the morphological level lies in the bound or free
status of the root (a lexical morpheme), i.e. whether the root can ‘stand alone’ and occupy
a syntactic slot (1), thus matching free-standing words in fusional languages, or whether
it must be formally conjoined with another bound or free root, or with a derivational
affix, to form an autonomous lexeme/word (2).


(1) free roots: 貓 māo ‘cat’, 走 zǒu ‘walk, run away’


(2) bound roots: 衣 yī ‘clothing, clothes’, 毆 ōu ‘beat’


While the roots in (1) can be used by themselves in a sentence, those in (2) cannot stand
alone but occur in complex words such as 大衣 dà-yī ‘big-clothes, overcoat, topcoat’,
雨衣 yǔ-yī ‘rain-clothes, raincoat’, 衣櫃 yī-guì ‘clothes-cupboard, wardrobe’, 衣鉤 yī-gōu
‘clothes-hook, clothes hook’ (Arcodia & Basciano 2017: 105-106). Due to a strong
tendency towards disyllabification attested in the evolution of the Chinese language over
the centuries (see Shi 2002: 70-72), most roots are nowadays bound in Standard Mandarin
(about 70% according to Packard 2000). Therefore, the majority of words or lexemes are
compounds or other types of morphologically complex forms, typically ranging over all
major lexical categories.

Another crucial aspect of Chinese morphology lies in the absence of strictly morphological
criteria for the identification of the lexical category of roots (or stems, if
morphologically complex), with some exceptions.¹ As a matter of fact, no category-specific
morphology (such as declension/conjugation class markers in fusional languages) can
be deployed to partition roots into lexical classes, with a verb like 走 zǒu ‘walk, run
away’ being virtually indistinguishable at the morphological level from a noun like 書
shū ‘book’ (see Basciano 2017). Since there are no reliable morphological criteria to
identify lexemes as roots (or stems) endowed with lexical category features, the only reliable
criterion is the distributional one. For instance, only syntactic distribution can
discriminate among the adjectival, verbal and nominal uses of a stem (namely, a combination of
two roots) like 麻煩 máfan ‘annoying, bother, trouble’ (examples below from Basciano
2017: 561-562):


(3) a. 這件事很麻煩。
       zhè jiàn shì hěn máfan
       this CLF fact very troublesome
       ‘This fact is troublesome.’
    b. 他不願麻煩別人。
       tā bù-yuàn máfan biérén
       3SG.M not-willing trouble others
       ‘He is unwilling to trouble other people.’
    c. 你們在路上會遇到一些麻煩。
       nǐ-men zài lù-shàng huì yùdào yīxiē máfan
       2SG-PL in street-on may/will meet some trouble
       ‘You may/will run into some troubles on the road.’


¹Examples are words containing suffixes such as 子 -zi, e.g. 刷子 shuāzi ‘brush’ (cf. 刷 shuā ‘to brush’), and
頭 -tou, e.g. 想頭 xiǎngtou ‘idea’ (cf. 想 xiǎng ‘to think’), which are always nouns (see Basciano 2017).



Chiara Melloni & Bianca Basciano 


Thus, under the standard approach to lexemes proposed in 1.1, a relevant issue con- 
cerns the very existence of these units in the Chinese language where, at the lexical 
level, the very flexible distribution of lexical items seems to point in the direction of a 
lexicon whose base units (roots/stems) lack inherent category features. Moreover, the ex- 
amples in (3) shed light on the need for a very loose semantics of roots/stems, arguably 
incompatible with the specific semantic meaning of lexemes, as proposed in Fradin & 
Kerleroux (2003). Under the hypothesis that roots bear no category specification, their 
meaning should be ‘vague’ enough to make it compatible with the adjectival, verbal or 
nominal meanings that might be instantiated in the syntax. We may remark, however,
that the great flexibility observed in previous stages of the language has been largely
reduced over the centuries, first with a functional specialization of lexemes during the Han
period (206 BCE-220 CE), and then with the proliferation of compound words, whose
functional preference has always been much more rigid and stable (see Zádrapa 2017).
Even though cases of ‘regular ambiguity’ like the one in (3) are found, in Modern Chinese 
lexemes tend to be more fixed as far as lexical category and distribution are concerned; 
many roots have a ‘prototypical’ distribution and cannot be easily coerced into other
lexical categories. However, even very stable words may occasionally be placed in syntactic
slots usually occupied by other word classes, creating “innovative ambiguities” (Kwong
& Tsou 2003: 116; see also Basciano 2017). As observed by Zádrapa (2017), although it is
not possible to distinguish the prototypical from the non-prototypical use on a formal
basis, it is still possible to perceive a functional “strain” (or “pragmatic markedness” in
Bisang’s 2008 terms), which always results in a semantic shift of varying magnitude (see
Croft 2001: 73).


1.3 Reduplication phenomena in Mandarin Chinese 


Among word formation phenomena in Mandarin, reduplication is one of the most pro- 
ductive and, as we will see throughout this chapter, it is found across all major lexical cat- 
egories with both increasing (iconic) and diminishing (countericonic) values. Whereas 
there is no perfect correspondence between lexical categories and reduplication func- 
tions (verbs, for instance, can be reduplicated along one or the other function), we will 
see there is instead a tight correspondence between the structural pattern of reduplica- 
tion and its diminishing or increasing value, so that the two patterns are rigidly differ- 
entiated at the segmental and suprasegmental level. 

In recent years, reduplication in Sinitic has received growing attention. In this
chapter, we will try to shed new light on the reduplicative processes of Mandarin, and
to assess the structural and interpretive properties of the input (the bases of
reduplication) and the output of reduplicative processes. In particular, we will focus on the
question of the categorial status of the base of the reduplicative processes in Mandarin, i.e.
what the base units are and, specifically, whether reduplication applies to category-less

²In syntactic approaches to word formation such as Distributed Morphology, the meaning of a word emerges
constructionally once the root has been categorized by a selecting head (n, v or a) in the course of syntactic 
derivation, and cannot be determined lexically. 
roots or to full-fledged lexemes/words. Empirically, the privileged domain of research
will be the increasing reduplication of disyllabic bases, or, as we dub it here, the AABB 
pattern, which will be compared with the diminishing pattern, characterized by the di- 
syllabic template ABAB. 

The comparison between the two patterns will allow us to show that they differ in
the type of units that constitute the base of the reduplicative process. Mandarin
reduplication, indeed, involves base units of different ‘size’: word/lexeme-like units
provided with category, namely verbs, in the case of diminishing reduplication, and
category-less roots in the case of increasing reduplication. Throughout the chapter, we
will provide evidence for the latter claim, i.e. that increasing reduplication involves
roots, and we will explore some category-neutral properties of increasing reduplication.
We will conclude with some remarks on the semantic effects of this phenomenon, which we
interpret as an increased measure function modifying the sortal type conveyed by the
(combination of) roots.


1.4 Outline of the chapter 


The chapter is organized as follows. Section 2 is dedicated to the presentation of the main 
patterns of full reduplication in Mandarin Chinese. Section 3 explores the characteriz- 
ing features of increasing reduplication (AABB pattern) in some detail and discusses 
its formal and interpretive properties across lexical categories. Section 4 contains the 
structural analysis and some hypotheses about the semantics of AABB increasing redu- 
plication, and section 5 draws the conclusions. 


2 Data description 


2.1 Reduplication in Mandarin: An overview 


Reduplication in Mandarin Chinese is a widespread and productive phenomenon, virtu- 
ally affecting all major lexical categories (V, Adj, N) and showing a tight relation between 
structural patterns (form) and semantic meanings (function). Semantically, Mandarin 
reduplications have augmentative/increasing and diminishing functions that are rigidly 
associated with different structural and/or suprasegmental patterns. 

The diminishing function is only found in the verbal domain. Reduplicated verbs typically
convey ‘delimitative’ or ‘tentative’ aspect (Chao 1968, Li & Thompson 1981, Tsao
2001), meaning to do something “a little bit/for a while” (Li & Thompson 1981: 29) or, by
extension, to do something quickly, lightly, casually or just for a try.³ Both monosyllabic
(A → AA) and disyllabic (AB → ABAB) bases can reduplicate, but only in monosyllabic
reduplication can the morpheme 一 yi (< yī) ‘one’ occur between the base
and the reduplicant:


³Further, it has the pragmatic function of marking a relaxed tone and casualness (Ding 2010), and thus
reduplicated verbs are also used as mild imperatives (see Xiao & McEnery 2004).
(4) a. 教 (A) → 教(一)教 (AA)
       jiāo        jiāo (yi) jiāo
       teach       teach one teach
       ‘teach’     ‘teach a little’
    b. 休息 (AB) → 休息休息 (ABAB)
       xiūxi       xiūxi xiūxi
       rest        rest rest
       ‘rest’      ‘rest a little/for a while’


It has been argued that this reduplicative process is a syntactic phenomenon involving
units in the vP domain (see Arcodia et al. 2014, Basciano & Melloni 2017). First of all, the
reduplicated complex is not a syntactic atom, since morphemes can intervene between the
base and the reduplicant: besides the numeral 一 yi (< yī) ‘one’ mentioned above, the
perfective aspect marker 了 le⁴ can occur in this position, as in (5):


(5) a. 走了走
       zǒu-le zǒu
       walk-PFV walk
       ‘walked a bit’
    b. 走了一走
       zǒu-le yi zǒu
       walk-PFV one walk
       ‘had a walk’


Moreover, diminishing reduplication is subject to event structure constraints (see Fra- 
din & Kerleroux 2003, for similar constraints in French word formation): the base verb 
must be a process verb, typically controlled by an agent and crucially lacking a result, 
which captures the fact that achievements, accomplishments and resultative compounds 
are systematically excluded from reduplication. Aspectually, the reduplicated verb is in- 
compatible with the progressive and durative aspectual markers while, as we have seen, 
it is perfectly compatible with the perfective aspect marker. Therefore, reduplication 
seems to modify the event structure of the base verb, providing a temporal boundary to 
the unbounded process expressed by the base (see Xiao & McEnery 2004). Other con- 
straints, e.g. purely morphological constraints, are not observed. 

In view of these facts and under the assumption that aspectual properties are syntac- 
tically encoded (see e.g. Travis 2000, 2010, Borer 1994, 2005, McClure 1995, Ramchand 
2008), Arcodia et al. (2014) propose that diminishing reduplication is a syntactic phe- 
nomenon affecting the vP domain, and develop a syntactic analysis to account for it; 
the reader is referred to Arcodia et al. (2014) and Basciano & Melloni (2017) for further 
details of the analysis. 


⁴Note that the perfective marker 了 le is generally placed after the second verb in resultatives and other
kinds of compound verbs: 喝醉了 hē-zuì-le ‘drink-drunk-PFV’ vs. *喝了醉 hē-le zuì ‘drink-PFV drunk’.
Increasing reduplication exhibits several properties that make it a very different
phenomenon from diminishing reduplication. First, increasing reduplication is found mainly
among adjectives, but it can be found with verbs and nouns/classifiers too. Consider the
following examples of adjectival reduplication:⁵

(6) a. 小 (A) → 小小 (AA)
       xiǎo        xiǎo-xiǎo
       small       small-small
       ‘small’     ‘very/really small’
    b. 高興 (AB) → 高高興興 (AABB)
       gāoxìng     gāo-gāo-xìng-xìng
       ‘happy’     ‘very happy’


In the adjectival domain, the increasing function expressed by this kind of reduplication
is not necessarily ‘very Adj’; rather, it makes the adjectives more descriptive,
indicating a higher degree of liveliness and vividness.⁶ As we will see in the next
section, unlike diminishing reduplication, increasing reduplication requires that
its base adjectives and verbs have specific structural properties.

Increasing reduplication applies to verbs too, but only if the base is bimorphemic and
its constituents are in a relation of coordination.⁷ In (7), for instance, the reduplicated
verb portrays two interrelated actions which are performed alternately or repeatedly, or
an action performed by a great number of people.


(7) 來往 → 來來往往
    lái-wǎng        lái-lái-wǎng-wǎng
    come-go         come-come-go-go
    ‘come and go’   ‘come and go repeatedly, come and go in great numbers’


AABB verbs, besides expressing pluractionality or action in progress (see Hu 2006,
Ding 2010), can also express vividness (8), or acquire an extended meaning, losing their 
verbal meaning and becoming more similar to adjectives in meaning and distribution 


⁵According to Li & Thompson (1981: 33), in AABB reduplication of adjectives the second syllable is
unstressed, and thus has a neutral tone. However, there is no clear consensus on tonal patterns in this kind of
reduplication. For example, according to Tang (1988: 282), the second syllable is in the neutral tone, while 
the third and fourth syllables, or just the fourth syllable, are in the first tone. Further, Tang observes that in 
Taiwan most people use the original tones, i.e. there is no tonal modification in this reduplication pattern 
(see also the examples in Paul 2010). 

⁶Xu (2012a: 6) states that, when adjectives are reduplicated, the degree of the adjective’s quality is generally
intensified. However, this does not seem to be always the case in the modern language: for example, she 
observes that colour perception can be subjective and variable, and thus adjectives indicating colours are 
prone to subjective interpretation. 

⁷Reduplication of monosyllabic verbs (AA) in Modern Chinese does exist but has a diminishing meaning
(see ex. (4a)). However, in previous stages of the language, before the appearance of the VV pattern with 
diminishing meaning, reduplication of monosyllabic verbs had an increasing function (repetition or action 
in progress); see e.g. Xu (2012a: 7). 
(9),⁸ depending on the linguistic context (on the meaning of AABB verbal reduplication,
see Hu 2006).


(8) 跑跳 → 跑跑跳跳
    pǎo-tiào         pǎo~pǎo-tiào~tiào
    run-jump         run~run-jump~jump
    ‘run and jump’   ‘skip, run about, run and jump in a vivacious way’
(9) 偷摸 → 偷偷摸摸
    tōu-mō           tōu~tōu-mō~mō
    steal-touch      steal~steal-touch~touch
    ‘pilfer’         ‘furtive, surreptitious, sneaky’


Finally, nouns can reduplicate too, conveying an overall increasing function, though 
AA reduplication no longer seems to be productive: 


⁸See the following examples, where 偷偷摸摸 tōu-tōu-mō-mō is used as a nominal modifier (i) and as an
adverbial, both without (ii.a) and with (ii.b) the adverbial marker 地 -de (examples from the Academia
Sinica Balanced Corpus of Modern Chinese:
http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/mkiwi/kiwi.sh?ukey=-78102521&qtype=1&ssl=7 [2017-08-25]).


(i) 最機警與最偷偷摸摸的一種動物
    zuì jījǐng yǔ zuì tōu-tōu-mō-mō de yī zhǒng dòngwù
    most astute and most furtive DET one CLF (type) animal
    ‘[...] the most astute and furtive animal.’

(ii) a. 不要偷偷摸摸寫
        bù yào tōu-tōu-mō-mō xiě
        not have furtive write
        ‘You must not write furtively.’
     b. 也盡量不要躲在角落裡偷偷摸摸地拍攝
        yě jǐnliàng bù yào duǒ-zài jiǎoluò lǐ tōu-tōu-mō-mō-de pāishè
        also as.much.as.possible not have hide-at corner in furtive-ADV take.picture
        ‘Also, as much as possible, you must not hide in a corner taking pictures furtively.’

Generally speaking, adjectives may function as adverbs, modifying verbs. Adverbs are generally formed 
from adjectives (though sometimes they can be formed from abstract nouns) but not from verbs. Basically, 
an adjective may modify both a noun/NP or a verb/VP, while a verb may only modify a noun/NP (see 
Arcodia 2014). 

It must be noted, though, that basically all reduplicated AABB verbs can have an adverbial use, and thus 
they all share an important property of adjectives: 


(iii) 妻子和女兒說說笑笑地準備著晚飯。
      qīzi hé nǚ'ér shuō-shuō-xiào-xiào-de zhǔnbèi-zhe wǎnfàn
      wife and daughter talk-talk-laugh-laugh-ADV prepare-DUR dinner
      ‘His wife and daughter were preparing dinner talking and laughing.’


(Center for Chinese Linguistics PKU corpus of Modern Chinese:
http://ccl.pku.edu.cn:8080/ccl_corpus/index.jsp?dir=xiandai [2017-07-24])
(10) a. 天 (A) → 天天 (AA)
        tiān                   tiān-tiān
        day                    day-day
        ‘day’                  ‘every day’
     b. 花草 (AB) → 花花草草 (AABB)
        huā-cǎo                huā-huā-cǎo-cǎo
        flower-plant/grass     flower-flower-plant-plant
        ‘flowers and plants’   ‘(many) flowers and plants’


Reduplicated monosyllabic nouns are said to have a distributive (see e.g. Li & Thompson
1981, Hu 1994, Li 2009, Xu 2012b) or plural-collective (Paris 2007) meaning. Given
the specific meaning of monosyllabic reduplications, their lack of productivity and the
fact that many of the nouns that can reduplicate display classifier-like properties, it is
disputable whether AA reduplication applies to actual nouns or to nominal classifiers
(functional elements in the extended NP domain); we will return to this in section 3.3. As
for disyllabic reduplicated nouns, the disyllabicity of the base (classifiers are never
disyllabic) points to uncontroversially nominal bases. Semantically, Zhang (2015) argues that
AABB reduplication is a plural marker, expressing ‘greater plurality’ (see Corbett 2000),
but according to Xu (2012b) it indicates distributivity, as we will see in section 3.3.


2.2 Diminishing vs. increasing reduplication 


From the brief overview provided above, a first interesting generalization arises. There is 
a correspondence between reduplicative pattern (with consistent structure and meaning) 
and lexical category, but limited to diminishing reduplication: AA or ABAB diminishing 
reduplication applies only to verbs, as input and output categories. Increasing reduplica- 
tion is very different in this respect because it cross-cuts lexical categories rather than 
being firmly associated with a word class (although AA/monosyllabic reduplication is 
unproductive nowadays with nouns and classifiers). 

Let us now focus on other differences between the two types of reduplication: it
appears that the two functions of reduplication are associated with a set of different formal
and selectional properties. A striking fact, especially considering how unstable
meaning-structure correspondences in reduplication are cross-linguistically, is
the tight correspondence between form and function observed in the reduplication of
disyllabic bases.⁹ While for monosyllabic bases the difference between increasing and
diminishing reduplication is visible only at the suprasegmental level,¹⁰ for disyllabic
bases (AB) the difference arises at the segmental level.


⁹Many (if not most) languages do not exhibit such a clear correspondence between patterns and functions
in reduplication (Mattes 2014). 

¹⁰According to some, diminishing reduplicated verbs are toneless, whereas the reduplicated adjective always
bears the first tone (Tang 1988: 282, Paul 2010: 120). However, according to Li & Thompson (1981: 33), 
the second syllable of reduplicated adjectives too is unstressed. As for the few monosyllabic nouns that 
reduplicate in Modern Chinese, it seems that the reduplicant keeps the same tone as the base noun. 
In the diminishing function, the base is reduplicated as a whole (ABAB), as in example
(4b), while in the increasing function, each morpheme is reduplicated by itself (AABB),
as seen in examples (6b), (7)-(9) and (10b). Thus, it appears that there is a strong
correlation between the function and the form of reduplication: as hinted at in section 2.1,
the ABAB pattern always conveys diminishing meaning, whereas the AABB pattern is
associated with increasing semantics, regardless of the word class of the input.
Interestingly enough, the AABB pattern seems to be associated with increasing semantics also
in other Sinitic languages (see Arcodia et al. 2015).
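
The segmental difference between the two templates can be made concrete with a minimal sketch. It is only an illustration of the descriptive generalization above, not part of the analysis: the function name `reduplicate` and the representation of a base as a list of toneless pinyin syllables (each syllable standing for one morpheme) are assumptions introduced here, and tonal and prosodic effects are ignored.

```python
from __future__ import annotations


def reduplicate(base: list[str], pattern: str) -> list[str]:
    """Return the syllables of a reduplicated form (hypothetical helper).

    base    -- syllables of the base, one per morpheme, e.g. ["gao", "xing"]
    pattern -- "ABAB": the base is copied as a whole (diminishing template)
               "AABB": each morpheme is copied in place (increasing template)
    """
    if pattern == "ABAB":
        return base + base
    if pattern == "AABB":
        # Double every unit before moving on to the next one.
        return [unit for unit in base for _ in range(2)]
    raise ValueError(f"unknown pattern: {pattern!r}")


# Disyllabic base: the two templates yield different strings, cf. (11).
print("-".join(reduplicate(["gao", "xing"], "ABAB")))  # gao-xing-gao-xing
print("-".join(reduplicate(["gao", "xing"], "AABB")))  # gao-gao-xing-xing

# Monosyllabic base: both templates collapse into AA, so the two functions
# cannot be told apart segmentally.
print(reduplicate(["xiao"], "ABAB") == reduplicate(["xiao"], "AABB"))  # True
```

The last line mirrors the observation that, for monosyllabic bases, increasing and diminishing reduplication can differ only at the suprasegmental level.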

It is worth noting that some disyllabic words predominantly showing an adjectival 
distribution can not only occur in the (standard) increasing template AABB, but they 
may also appear in the diminishing ABAB template, so that the same base eventually 
enters two reduplication templates formally and functionally distinct: 


(11) a. 高興 → 高高興興 (AABB) (cf. 6b)
        gāoxìng     gāo~gāo-xìng~xìng
        ‘happy’     ‘very happy’
     b. 高興 → 高興高興 (ABAB)
        gāoxìng     gāoxìng gāoxìng
        ‘happy’     ‘have some fun’


Crucially, these minimal pairs are restricted to disyllabic bases amenable to a
verbal/dynamic interpretation in addition to an adjectival/stative one, as we can see in the
ABAB pattern in (11b). Therefore, (11b) is not a counterexample to the generalization that
only verbs can be reduplicated along the ABAB pattern.

Moreover, the difference between diminishing and increasing reduplication is not only 
semantic, but also concerns the restrictions on the input and on the output. As for dimin- 
ishing reduplication, the selection restrictions, as we have seen, seem to be aspectual and 
allegedly dependent on event structure constraints, while for increasing reduplication 
these restrictions are (mostly) morphological, as we will see in the next section. 


3 Increasing reduplication: input and output 


Unlike diminishing reduplication, increasing reduplication requires that its bases
have specific morphotactic and semantic properties. In what follows we focus on the 
category-specific and category-neutral restrictions of increasing reduplication and de- 
scribe the properties of the outputs of these reduplications across the major lexical cat- 
egories. 


3.1 Adjectives 


In the adjectival domain, increasing reduplication applies indifferently to monosyllabic
and to disyllabic bases. In both cases, the base adjective must be gradable; thus, absolute
adjectives cannot reduplicate: e.g. 方 fāng ‘square’ cannot give rise to *方方 fāng~fāng
(see Paris 1979, cit. in Paul 2010: 139, fn. 18).¹¹ Therefore, adjectival reduplication only
applies to bases that encode a degree/scalar value (see also Zhu 2003). At the
morphotactic level, we find restrictions as far as disyllabic bases are concerned: as a matter of
fact, the AABB pattern requires a disyllabic and bimorphemic base, whereas disyllabic
monomorphemic words cannot be reduplicated (Paul 2010: 137):¹²


(12) 窈窕 → *窈窈窕窕
     yǎotiǎo     *yǎo-yǎo-tiǎo-tiǎo
     ‘graceful, gentle’


Also, the two morphemes must be lexical. For instance, adjectives formed with a prefix- 
like element cannot reduplicate (see Zhu 2003): 


(13) 不安 → *不不安安
     bù-ān     *bù~bù-ān~ān
     not-peaceful
     ‘troubled/restless’


It thus appears that units are here handled strictly on a morphemic basis, rather than 
on a prosodic basis. Moreover, the possible bases for AABB reduplication are either lex- 
icalized, non-transparent bases (14a), or adjectives formed by two morphemes with a 
similar meaning (14b) or in a logical coordination (14c): 


(14) a. 馬虎 → 馬馬虎虎
        mǎ-hu                mǎ-ma-hū-hū
        horse-tiger          horse-horse-tiger-tiger
        ‘careless, casual’   ‘careless, casual (stronger)’


¹¹However, Tang (1988: 279-283) lists 方方 fāng~fāng ‘square~square’ among possible reduplicated
adjectives. This could possibly be the result of a coerced interpretation (see e.g. English very square face).
Indeed, Tang highlights that adjectives that express distinctive properties (e.g. appearance, size and colour)
generally can reduplicate even when, as in the case of 方 fāng ‘square’, they are not used predicatively and
cannot be modified by degree adverbs (examples from Tang 1988: 283):


(i) 他的臉很方
    tā de liǎn hěn fāng
    3SG.M DET face very square
    ‘His face is very square.’
(ii) (很)方的臉
     (hěn) fāng de liǎn
     (very) square DET face
     ‘A (very) square face’
(iii) 方方的臉
      fāng~fāng de liǎn
      square~square DET face
      ‘A (very/really) square face’


¹²窈窕 yǎotiǎo is an example of partial reduplication in Old Chinese, involving rhymes only, traditionally
called 疊韻 diéyùn ‘reduplicated rhymes’: 窈窕 *ʔiwʔ-liwʔ > ʔewX-dewX > yǎotiǎo (Sagart 1999: 137).



     b. 快樂 → 快快樂樂
        kuài-lè           kuài-kuài-lè-lè
        pleased-happy     pleased-pleased-happy-happy
        ‘happy’           ‘very/really happy’
     c. 高大 → 高高大大
        gāo-dà            gāo-gāo-dà-dà
        tall-big          tall-tall-big-big
        ‘tall and big’    ‘very/really tall and big’


These data show that the disyllabic AABB template applies to complex bases that are 
structurally and semantically symmetrical, i.e. exocentric or coordinative structures lack- 
ing a clearly identifiable head. Adjectival reduplication, thus, seems to be conditioned 
by morphosyntactic (word-internal) factors. 
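
The input restrictions on disyllabic adjectival bases discussed so far can be summarized as a simple checklist. The sketch below is only a hedged restatement of those descriptive conditions: the class `AdjBase`, its fields, the relation labels and the function `allows_aabb` are hypothetical names introduced for illustration, and the gradability values assigned to the sample bases are assumptions rather than claims from the sources cited above. Monosyllabic (AA) reduplication is not covered.

```python
from dataclasses import dataclass
from typing import Tuple

# Relations between the two morphemes that count as "symmetrical" here:
# lexicalized/opaque bases (14a), near-synonyms (14b), logical coordination (14c).
SYMMETRIC_RELATIONS = {"opaque", "synonymy", "coordination"}


@dataclass(frozen=True)
class AdjBase:
    morphemes: Tuple[str, ...]  # toneless pinyin, one entry per morpheme
    lexical: Tuple[bool, ...]   # True if the corresponding morpheme is lexical
    gradable: bool              # does the base encode a degree/scalar value?
    relation: str               # hypothetical label for the internal structure


def allows_aabb(base: AdjBase) -> bool:
    """Check the conditions on AABB adjectival reduplication described above."""
    return (
        base.gradable                      # absolute adjectives are excluded
        and len(base.morphemes) == 2       # the base must be bimorphemic
        and all(base.lexical)              # no prefix-like elements
        and base.relation in SYMMETRIC_RELATIONS
    )


# Sample bases; the gradability values are assumed for illustration only.
gao_da = AdjBase(("gao", "da"), (True, True), True, "coordination")  # cf. (14c)
ma_hu = AdjBase(("ma", "hu"), (True, True), True, "opaque")          # cf. (14a)
bu_an = AdjBase(("bu", "an"), (False, True), True, "prefixed")       # cf. (13)
yaotiao = AdjBase(("yaotiao",), (True,), True, "none")               # cf. (12)

for base in (gao_da, ma_hu, bu_an, yaotiao):
    print("-".join(base.morphemes), allows_aabb(base))
```

Under these assumptions the first two bases pass the check and the last two fail it, matching the contrast between (14) on the one hand and (12)-(13) on the other.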

As for the output, the reduplicated adjective loses its gradability: while the base must 
be gradable, the reduplicated adjective is no longer gradable. As a matter of fact, whereas 
the (scalar) base adjective is compatible with degree modifiers such as ‘very’ and ‘fairly’, 
which indicate a high level on the scale of the (gradable) property expressed by the 
adjective they modify, the reduplicated adjective is not: 


(15) a. 長 → 非常長
        cháng           fēicháng cháng
        ‘long’          ‘very long’
     b. 長長 → *非常長長
        cháng~cháng     *fēicháng cháng~cháng
        ‘long~long’     ‘very long~long’


Moreover, whereas the base adjective can appear in the comparative construction, the 
reduplicated adjective cannot: 


(16) a. 我的頭髮比他的長。
        wǒ de tóufa bǐ tā de cháng
        1SG DET hair COMP 3SG.M DET long
        ‘My hair is longer than his.’
     b. *我的頭髮比他的長長。
        wǒ de tóufa bǐ tā de cháng~cháng
        1SG DET hair COMP 3SG.M DET long~long



However, there is a group of adjectives for which reduplication works differently. These
are adjectives that typically involve a modifier-head structure, such as 雪白 xuě-bái
‘snow-white’, which reduplicates as ABAB (雪白雪白 xuě-bái-xuě-bái). The function is
reportedly increasing, as in the case of AABB reduplicated adjectives. This might appear
to be an exception to the form-function identity between ABAB reduplication and
diminishing meaning in Mandarin. It must be noted, though, that modifier-head adjectives like
雪白 xuě-bái ‘snow-white’ are not gradable and, indeed, they are not compatible with
degree adverbs and cannot be used in the comparative construction. Therefore,
reduplication does not result in a change in the gradability of the base adjective, as is the case
with AA and AABB adjectival reduplication. Adjectival ABAB reduplication, thus, seems
to be a phenomenon distinct from the other patterns of reduplication described in this
section. We will return to this issue in section 3.5, when discussing the word/lexeme
status of the bases of increasing reduplication.


3.2 Verbs 


As for verbs, increasing reduplication poses no aspectual requirements on the base unit
since all kinds of verbs, including inherently telic verbs like 來 lái ‘come’, 進 jìn ‘enter’
or 出 chū ‘exit’, are allowed (see ex. (7), repeated here as (17c)). Nonetheless, increasing
reduplication requires base units that possess specific structural properties. As a matter
of fact, AABB increasing reduplication is generally possible only for coordinated
complex verbs, the constituents of which may be in a relation of logical coordination
(17a), synonymy (17b) or antonymy (17c):


(17) a. 说笑              →  说说笑笑
        shuō-xiào            shuō-shuō-xiào-xiào
        talk-laugh           talk-talk-laugh-laugh
        ‘talk and laugh’     ‘talk and laugh continuously’
     b. 叫嚷              →  叫叫嚷嚷
        jiào-rǎng            jiào-jiào-rǎng-rǎng
        call-shout           call-call-shout-shout
        ‘shout, howl’        ‘shout repeatedly’
     c. 来往              →  来来往往
        lái-wǎng             lái-lái-wǎng-wǎng
        come-go              come-come-go-go
        ‘come and go’        ‘come and go repeatedly, come and go in great numbers’
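The mapping from base to output in (17) can be stated procedurally. What follows is only a minimal sketch in Python, under the assumption that the base has already been segmented into its two constituents A and B; the function names `aabb` and `abab` are ours and purely illustrative, and tonal or prosodic adjustments (such as neutral tone) are ignored.

```python
def aabb(a: str, b: str, sep: str = "-") -> str:
    """Increasing reduplication: each constituent is copied in place (A A B B)."""
    return sep.join([a, a, b, b])


def abab(a: str, b: str, sep: str = "-") -> str:
    """Whole-base copying: the disyllabic base is repeated as a unit (A B A B)."""
    return sep.join([a, b, a, b])


if __name__ == "__main__":
    # (17c): the coordinate base 'come-go' under the AABB template
    print(aabb("lái", "wǎng"))
    # For comparison, a modifier-head base 'snow-white' under the ABAB template
    print(abab("xuě", "bái"))
```

The sketch makes explicit that the two templates differ only in what is copied: the individual constituents in AABB, the base as a whole in ABAB.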


Note that in (17) the bases of reduplication are existing verbs, but this is not
necessarily always the case, as e.g. 走走停停 zǒu-zǒu-tíng-tíng ‘walk and stop’ (there is no
corresponding base verb *走停 zǒu-tíng).¹⁴


13 According to Paul (2010: 137, fn. 15), “[the] reduplication pattern for ‘modifier-adjectival head’ compounds
deriving an adjective of the form [A° ABAB] is not to be confounded with the repetition of a disyllabic verb
as a whole in syntax: [V° AB] [V° AB]”.

14 An alternative analysis might posit that verbal AABB reduplication is the result of the coordination of
two reduplicated verbs, [A-A] [B-B]. However, note that since the reduplication of monosyllabic verbs
expresses a delimitative meaning, the coordination of two monosyllabic reduplicated verbs should result
in a delimitative semantics. Further, this analysis is not tenable because telic verbs like 来 lái ‘come’, as
said above, cannot reduplicate by themselves: *来来 lái-lái.

Also, it is worth remarking that the verbal reduplication pattern AABB may also be
found with disyllabic monomorphemic verbs, such as (18a), or with other kinds of compound
verbs (18b and 18c):


(18) a. 哆嗦
        duōsuo
        ‘tremble’
     b. 飘悠
        piāo-you
        ‘float-long/leisurely, wobble, stagger’
     c. 闹腾
        nào-teng
        ‘noisy-jump, disturb/create confusion’


As for the prosodic properties of the pattern, the second morpheme/syllable of
non-coordinate compound verbs that can undergo AABB reduplication generally has the
neutral tone, suggesting that these are lexicalized forms. Thus, similarly to adjectives, the
AABB template in the verbal domain basically applies to structurally and semantically
symmetrical bases, but it can also apply to unanalyzable morphemes or to lexicalized
forms.¹⁶ For some of these lexicalized forms, it is possible that they originate from
coordinating structures whose relationship became opaque with time, but an in-depth
diachronic analysis is needed to substantiate this hypothesis.

As for the output, AABB reduplication of verbs seems to operate at the aspectual 
level, expressing repetition or action in progress. However, as we have seen, it can also 
express vividness (8), or other kinds of more abstract meanings (9), closely approaching 
adjectival reduplicative processes. 


3.3 Nouns 


As we have seen, reduplicated monosyllabic nouns are said to have a ‘distributive’ or
‘plural collective’ meaning:

(19) 人人都喜欢受人称赞。
     rén-rén       dōu xǐhuan shòu    rén    chēngzàn
     person~person all like   receive person praise
     ‘Everybody likes to be praised by people.’


15 Toneless items in Chinese are typically grammatical morphemes, such as e.g. aspectual markers, (some) no
longer productive derivational suffixes, and the second syllables of some reduplicated or compound words,
as e.g. 爸爸 bàba ‘father’, 学生 xuésheng ‘student’. Thus, lack of tone is a clue of either grammaticalization
or lexicalization.

16 The only constraint which does not seem to be morphological but rather aspectual concerns coordination
of telic verbs: as we have seen, telic verbs may appear in the AABB pattern of reduplication, but if they
do they must be antonyms (as in ex. 7/17c), i.e. reduplication of synonymic telic verbs does not seem to be
possible (see Zhang 2016). This might be due to the fact that the coordination of two antonymic telic verbs
(like enter-exit) results in the annulment of the télos, which seems to suggest that, actually, the bases of
this kind of reduplication too must express an overall atelic event. This issue deserves further research.

Several authors (e.g. Hu 1994, Cai 2007, Li 2009) stress the fact that reduplication of
monosyllabic nouns may be assimilated to classifier reduplication and that many of the
nouns that can reduplicate show classifier-like properties. For example, Hu (1994: 103)
observes that at least part of these (alleged) nominal bases can directly follow a numeral
without an intervening classifier, as e.g. 一年 yī nián ‘one year’, 三户 sān hù ‘three
households’, and they can themselves work as classifiers, as e.g. 三户人家 sān hù rénjiā
‘three household(CLF) family, three families’, thus exhibiting properties of (nominal)
classifiers.

Reduplication of classifiers, as it is generally reported in reference grammars,
seems to convey a distributive meaning:

(20) 看书的时候，书上的字不可能个个都认识。
     kàn  shū  de  shíhou, shū  shàng de  zì        bù  kěnéng gè-gè   dōu rènshi
     read book DET time    book on    DET character not can    CLF~CLF all know
     ‘You cannot know all the characters/each character of the books you read.’


According to Paris (2007: 68), however, reduplicated classifiers get a (plural) distribu- 
tive meaning when they appear in pre-verbal position (21a), while they get a plural col- 
lective interpretation when they occupy the post-verbal position (21b): 


(21) a. 他个个学生都认得。
        tā    gè-gè   xuésheng dōu rènde
        3SG.M CLF~CLF student  all be.acquainted.with
        ‘He knows all the students (individually).’¹⁷
     b. 在分析上遇见种种困难。
        zài fēnxī    shàng yùjiàn zhǒng-zhǒng kùnnan
        at  analysis on    meet   CLF~CLF     difficulty
        ‘Come across all kinds of difficulties during the analysis.’


According to Zhang (2014), reduplication of classifiers in Mandarin is a type of plural 
marking; it denotes plurality of units (groups/collectives) rather than of individuals. 
Units and individuals can overlap, like in (22a), but it is not always the case, like in (22b), 
where ‘lotus’ is the individual, while ‘lotus pile’ is the unit that reduplicates (examples 
from Zhang 2014: 6): 


(22) a. 河里漂着（一）朵朵莲花。
        hé    lǐ piāo-zhe  (yī)  duǒ-duǒ liánhuā
        river in float-DUR (one) CLF~CLF lotus
        ‘There are many lotuses floating on the river.’
     b. 地上有一堆堆莲花。
        dì    shàng yǒu  yī  duī~duī       liánhuā
        earth on    have one CLF(pile)~CLF lotus
        ‘There are piles of lotuses on the ground.’

17 Paris notes that it is not possible to have the noun preceded by the reduplicated classifier in post-verbal
position with the same meaning as (21a), so that the following sentence is ungrammatical:
(i) *他认得个个学生。
    tā    rènde              gè-gè   xuésheng
    3SG.M be.acquainted.with CLF~CLF student


Zhang (2014: 12) argues that the distributive meaning emerges when reduplicated
classifiers occur with the adverb 都 dōu ‘all’ (even when it is allowed but does not show up;
see e.g. Guo 1999) or other kinds of adverbials:


(23) 个个学生都有自己的网页。
     gè-gè   xuésheng dōu yǒu  zìjǐ de  wǎngyè
     CLF~CLF student  all have own  DET webpage
     ‘All of the students have their own webpage.’


In contrast, according to Zhang, in (24), where no 都 dōu ‘all’ is allowed, the
distributive meaning is not possible (example from Zhang 2014: 12):


(24) 双双情人步入会场。
     shuāng~shuāng qíngrén bù-rù      huì-chǎng
     CLF(pair)~CLF lover   step-enter meet-place
     ‘Many pairs of lovers stepped into the meeting place.’


According to Zhang (2014: 12), the fact that reduplicated classifiers do not have an 
intrinsic distributive reading is proven by the compatibility with collective verbs. 

Going back to reduplication of monosyllabic nouns proper, Paris (2007) argues that
it expresses a ‘plural collective’ meaning; more specifically, it denotes a collectivity of
elements sharing the same properties, which can function either as an argument or as an
adverbial. According to Paris (2007: 69-70), reduplication of monosyllabic units does not
have a distributive meaning, as shown by the contrast between (25a) and (25b), where the
first one contains a reduplicated noun (天天 tiān-tiān ‘day~day, every day’), while the
second contains the quantifier 每 měi ‘each’. In (25b) the object is necessarily distributed,
i.e. it must be a different poem every day, while this is not necessarily the case in (25a).¹⁸
(25) a. 他天天都读一首诗。
        tā    tiān-tiān dōu dú   yī  shǒu shī
        3SG.M day~day   all read one CLF  poem
        ‘He reads a poem every day.’
     b. 他每一天都读一首诗。
        tā    měi  yī  tiān dōu dú   yī  shǒu shī
        3SG.M each one day  all read one CLF  poem
        ‘Every day he reads a (different) poem.’


18 Note that in (25a) 都 dōu ‘all’ is used but, according to Paris, we do not get the distributive reading. This
contrasts with what Zhang argues about classifiers, where the presence of this adverb would lead to a
distributive reading (see above).

Providing a detailed picture of the kind of plural readings expressed by reduplicated
classifiers is beyond the scope of this chapter; however, what we want to stress here is
that it is not easy to trace a clear boundary between different kinds of plural readings
and that arguably different readings can be related to distributional/syntactic rather than
solely lexical factors.

As for reduplication of disyllabic nouns, a first element is the indisputable categorial
nature of the input, since classifiers are all monosyllabic. Structurally, nominal bases
seem to be subject to the same morphological constraints observed for AABB adjectives
and verbs. The AB base nouns usually entail a relation of coordination between their
constituents: either logical coordination (see 26a), or synonymy or antonymy (26b) (see
Tang 1979: 114; Zhang 2015):¹⁹


(26) a. 家户                 →  家家户户
        jiā-hù                  jiā-jiā-hù-hù
        family-household        family~family-household~household
        ‘household/family’      ‘every family/each household/many families’
     b. 老少                 →  老老少少
        lǎo-shào                lǎo-lǎo-shào-shào
        old-young               old-old-young-young
        ‘the old and the young’ ‘old people and young people’


As we have seen with adjectives (14), we can also find more lexicalized forms like: 


(27) a. 风雨                 →  风风雨雨
        fēng-yǔ                 fēng-fēng-yǔ-yǔ
        wind-rain               wind-wind-rain-rain
        ‘wind and rain/trials and hardships’    ‘trials and hardships/storms’
     b. 点滴                 →  点点滴滴
        diǎn-dī                 diǎn-diǎn-dī-dī
        dot-drip/drop           dot-dot-drop-drop
        ‘droplet’               ‘dribs and drabs/bit by bit’


The nominal AABB pattern of reduplication seems to be well-established in the Chinese
lexicon (see e.g. Hu 1994, Wu & Shao 2001), and can be extended to disyllabic nouns that
usually do not reduplicate (28a, Hu 1994: 106). Also, two monosyllabic nouns A and B
that do not form an AB compound word, but satisfy the coordination requirements seen
above, can reduplicate along the AABB pattern, forming novel combinations (28b, see
Wu & Shao 2001: 12):

19 Note that some AABB lexicalized nouns do not have an AB compound counterpart (see Wu & Shao 2001:
12): e.g. 生生世世 shēng-shēng-shì-shì ‘life~life-generation~generation, generation after generation’ (*生世
shēng-shì). Generally speaking, it is possible to form AABB nouns from the coordination of two items that
do not form an AB compound (see (28b) and the related discussion).


(28) a. 情景                 →  情情景景
        qíng-jǐng               qíng-qíng-jǐng-jǐng
        feeling-scene           feeling-feeling-scene-scene
        ‘scene, sight, circumstances’    ‘every scene, all scenes’
     b. 盆  罐               →  盆盆罐罐
        pén  guàn               pén~pén-guàn~guàn
        ‘basin/pot’  ‘jar’      ‘pots and jars’


According to Zhang (2015: 7), though, the AABB nominal pattern is not productive, since
many acceptable compound nouns formed by parallel constituents do not reduplicate
(she argues the same for verbs too). This is however questionable, since one of the
examples she mentions, i.e. 桌椅 zhuō-yǐ ‘table-chair, tables and chairs’ → 桌桌椅椅
zhuō-zhuō-yǐ-yǐ ‘table~table-chair~chair’, is listed as an example of a reduplicated AABB
noun by Wu & Shao (2001: 12-13), who put it among AABB ‘temporary’ combinations
with low frequency. Even though it is not easy to establish the productivity of a pattern,
we believe that ‘occasional’ usages and the possibility of coining new AABB nouns are hints
of its productivity.

As for its function, as we have mentioned, Zhang (2015) argues that AABB expresses
‘greater plurality’ (see also Wu & Shao 2001), though it sometimes seems to have a
distributive meaning, as in the case of reduplicated monosyllabic nouns; and, indeed, as
we have seen, according to Xu (2012a), reduplicated AABB nouns indicate distributivity.
See the examples below:²⁰


(29) a. 家家户户的门前都挂着青天白日满地红的国旗 [...]
        jiā-jiā-hù-hù                     de  mén-qián   dōu guà-zhe
        family~family-household~household DET door-front all hang-DUR
        qīng-tiān-bái-rì   mǎn-dì      hóng de  guó-qí
        blue-sky-white-sun full-ground red  DET country-flag
        ‘In front of the door of each household hung the red national flag with the
        white sun in the blue sky [...]’
     b. 海水浴场里，男男女女，老老少少，都穿着各种不同款式的泳装 [...]
        hǎi-shuǐ  yù-chǎng  lǐ, nán-nán-nǚ-nǚ,      lǎo-lǎo-shào-shào,  dōu
        sea-water bath-site in  man~man-woman~woman old~old-young~young all
        chuān-zhe gè   zhǒng     bùtóng    kuǎnshì de  yǒng-zhuāng
        wear-DUR  each CLF(kind) different style   DET swim-suit
        ‘Every man, woman, old and young bathing in the sea was wearing all
        different styles of swimming suits’


20 Examples from Academia Sinica Balanced Corpus of Modern Chinese: http://app.sinica.edu.tw/cgi-bin/kiwi/mkiwi/kiwi.sh [2016-11-24].

In any case, it is possible to argue that this reduplication pattern expresses a kind
of plural and, indeed, Xu (2012a) argues that reduplication, like plural marking, is one
of the major devices for indicating plurality in human languages.²¹ This plural displays
interesting properties: it is compatible with ‘numeral+classifier’ constructions (30a) and,
most importantly, it seems to be compatible with the plural marker 们 -men²² (30b):


(30) a. 200多个子子孙孙前来祝寿。
        èrbǎi duō  ge  zǐ~zǐ-sūn~sūn             qiánlái zhù-shòu
        200   more CLF son~son-grandson~grandson come    congratulate-longevity
        ‘More than 200 children and grandchildren came to congratulate [the old
        woman] on her birthday.’²³
     b. [...] 让我们的子子孙孙们还能依靠这个地球生活。
        ràng wǒ-men de  zǐ-zǐ-sūn-sūn-men             hái   néng yīkào zhè  ge
        let  1SG-PL DET son~son-grandson~grandson-PL still can  rely  this CLF
        dìqiú shēnghuó
        earth live
        ‘[...] to let the future generations still be able to rely on this earth to live’²⁴

From a typological perspective, it is interesting to observe that in languages where
reduplication and classifiers are found extensively, plural marking is not well developed
and is sensitive to the semantic feature [+human] (Xu 2012a: 12), just like in Mandarin
(see Corbett 2000 for a more comprehensive overview of number marking across
languages). Xu (2012a) further remarks that the more plural marking is developed, the less
this semantic feature ([+human]) is required; also, the more a language possesses
developed plural markers, the less it needs reduplication and classifiers.

At the distributional level, the possible co-occurrence of AABB reduplication and of
the plural marker 们 -men suggests that these two forms of pluralization cannot be
equated, and, in a syntactically oriented approach to word formation and inflection, it
indicates that these two plurals occupy different syntactic positions in the (extended)
nominal projection. In particular, following Wiltschko's (2008) analysis of plural
markers in Halkomelem Salish, we will argue that the reduplicative process is a derivational
process that operates at the root level, even before root categorization is determined.
This analysis allows us to explain the otherwise unexpected occurrence of 们 -men plural
marking on AABB (animate) nouns, which could be analysed as a modifier in the DP
domain. We will go back to this issue in section 4.

21 Xu (2012b: 48) highlights some general tendencies in the languages of the world: 1) languages with
obligatory plural marking tend not to have classifiers (see Greenberg 1972, Sanches & Slobin 1973; but see e.g.
Bisang 2012); 2) languages without obligatory plural marking tend to use reduplication to express
plurality. In general, languages which do not have plural marking seem to appeal to both reduplication and
classifiers.

22 The plural marker 们 -men can be added only to human nouns; it is entirely optional and is generally used
"only when there is some reason to emphasize the plurality of the noun" (Li & Thompson 1981: 40). It is
obligatorily used only with personal pronouns. Moreover, if the noun is preceded by a ‘numeral+CLF’, the
marker 们 -men cannot be used: *三个老师们 sān ge lǎoshī-men ‘three CLF teacher-PL, three teachers’ (cf.
30a). This can be taken as an indication of the fact that 们 -men is a marker of pluralization connected to
the determiner/classifier domain, rather than being involved at the NP level.

23 http://news.xinhuanet.com/society/2007-10/06/content_6833517.htm [2016-11-24].

24 http://www.china-coop.org/index.php?ac=article&at=read&did=854 [2016-11-24].


3.4 Further remarks on the AABB pattern 


To sum up, the data above show that increasing AABB reduplication is sensitive to the 
morphological makeup of its input, and insensitive to the categorial feature of the base 
(Adj, V, N) or, semantically, to its ontological/sortal type (whether the base denotes 
a quality, an event, or an entity/individual). As for the morphological restrictions on 
the base units, it is worthwhile noting that the requirement of a compound base of 
a specific type is also category-neutral, since it is found with AABB adjectives, verbs 
and nouns. In particular, the kind of root combinations we find seem to have much in 
common with ‘co-compounds’, in particular, with the following categories singled out 
by Wälchli (2005: 138): ‘additive co-compounds’, as e.g. Georgian xel-p’exi ‘hand-foot’; 
‘generalizing co-compounds’, as e.g. Mordvin t’ese-toso ‘here-there, everywhere’; collec- 
tive co-compounds, as e.g. Chuvash sét-su ‘milk-butter, dairy products’; synonymic co- 
compounds, as e.g. Uzbek qadr-qimmat ‘value-dignity, dignity’. 

According to Wälchli, additive co-compounds denote pairs consisting of the parts A
and B; in a broader sense, they denote sets exhaustively listed by A and B. Generalizing
co-compounds denote general notions (as e.g. ‘all’, ‘always’); their parts express the
extreme opposite poles of which the whole consists. As for collective co-compounds, they
are not always easy to define since they obey different criteria, which do not always
agree: the parts do not exhaustively list the whole; the whole comprises all meanings
having the properties shared by A and B; collective co-compounds are co-compounds which
denote collectives.²⁵ Finally, in synonymic co-compounds, the constituents (A and B)
and the whole compound have (almost) the same meaning. Wälchli observes that
synonymic co-compounds "express homogeneous collection complexes in which (ideally)
every element contained in them can be referred to by both parts of the co-compound" (p.
140). This, according to Wälchli, explains the affinity between synonymic co-compounds
and plurality, though there is no language in which synonymic compounds work as fully
grammaticalized plurals. Synonymic co-compounds may have affinities with collective,
additive or generalizing co-compounds. In any case, each type of co-compound
described above may be considered as a complex where the referents are joined together
to indicate a ‘set’.

Interestingly enough, the AABB pattern can apply to AB bases that are not attested 
as coordinated bases (see sections 3.2, 3.3), and crucially it can be ‘category-changing’ 
(see Paul 2010: 145-146; cf. also ex. (9)): 


(31) 婆婆妈妈 [AABB] = Adj
     pó-po-mā-mā
     old.lady-old.lady-mother-mother
     ‘kindhearted/sentimental/effeminate’


25 The example from Chuvash reported above meets all three criteria, but this is not always the case. It is
difficult to distinguish between additive and collective co-compounds if the first two criteria do not apply
at the same time.

In (31), the AB base is not an existing word, but AABB reduplication applies to two
free/non-conjoined lexical roots. Reduplication of two elements independently
compatible with a nominal meaning²⁶ results in an adjectival AABB lexeme.

Furthermore, the AABB pattern extends to other categories too, like numerals, place
words, coordinated classifiers, onomatopoeias, etc. (see Hu 1994):


(32) a. 千千万万
        qiān~qiān-wàn~wàn
        thousand~thousand-ten.thousand~ten.thousand
        ‘thousands and thousands’
     b. 前前后后
        qián-qián-hòu-hòu
        front-front-back-back
        ‘whole story/ins and outs’
     c. 嘻嘻哈哈
        xī-xī-hā-hā
        giggling.onomatopoeia-giggling.onomatopoeia-
        laughter.onomatopoeia-laughter.onomatopoeia
        ‘laughing and joking’


All these facts seem to support the hypothesis that the AABB reduplication pattern
applies even before the conjoined bases get their categories (and indeed the constituents
can be bound roots too).²⁷ This is consistent with an analysis according to which word
formation can apply to roots, or in this specific case, to the combination/coordination of
category-less roots, which would explain why, unlike ABAB diminishing
reduplication, it is a phenomenon found across almost all word classes.²⁸ We will go back to
this in section 4, where we will put forth an analysis for this reduplication pattern.


3.5 On the base units of AABB reduplications 


As we have seen in 2.2, diminishing reduplication does not form syntactic atoms and can
be analyzed as a syntactic operation whose application is conditioned by structural
restrictions in the vP domain (see Arcodia et al. 2014, Basciano & Melloni 2017). In contrast,
we have shown that increasing reduplication is subject to ‘morphological’ restrictions.
In keeping with previous research on reduplication and plural marking, we argue
that AABB increasing reduplication is the result of the modification of roots (see section
4), understood here, as in most exoskeletal approaches (see Borer 2003), as elements
crucially lacking category features. Moreover, as we will show in detail in the next
section, AABB reduplications are syntactic atoms which do not allow for the insertion of
other material between the iterated units (see e.g. Lapointe 1980).

26 It is worth noticing that when the base is formed by a bound root constituent, like 婆 pó ‘old.lady’ in (31),
we cannot determine its lexical category since bound roots do not occupy syntactic slots (see section 1.2);
rather, it can be said that these roots are ‘noun-like’ semantically, i.e. they denote entities/individuals (see
section 3.5).

27 A reviewer observed that it is difficult to make such a claim if the cases mentioned in this section are
well-established lexicalized formations. Actually, these cases seem to be quite marginal, and for category
changing items it is quite expected, since intuitively we expect that reduplication of two roots compatible
with the nominal meaning leads to a nominal output. However, these examples further highlight the
cross-categoriality of the pattern and further support the hypothesis of the acategoriality of the base roots. In
any case, it is beyond doubt that bound roots can enter this pattern of reduplication (see e.g. the reduplicated
word in the examples (30) above, where both roots are bound), which as mentioned above (footnote 26;
see also section 3.5) do not have a lexical category, and this points toward the acategorial nature of the
conjoined roots.

28 Reduplication of non-existent AB bases is not possible with diminishing verbal reduplication; in ABAB
verbal reduplication, the AB base must be an existing disyllabic verb.

Different pieces of evidence speak in favour of the hypothesis that AABB reduplication
applies to an element smaller than a word, i.e. a root/stem, which possibly lacks per se a
definite category specification. In what follows, we will concentrate on the differences between
AABB/increasing reduplication and other reduplicative processes to illustrate our point.

First of all, let us consider the verbal domain, where we find both diminishing
reduplication and increasing reduplication. A first crucial difference between the two patterns,
namely ABAB and AABB verbs, concerns the distribution of aspectual markers. With
AABB reduplicated verbs, if an aspectual marker is present, it follows the whole
reduplicated verb (33a), as in the case of resultatives and other kinds of compound verbs
(cf. fn. 4). In diminishing reduplication, as we have seen, the aspectual marker 了 le is
unexpectedly placed between the base and the reduplicant (33b):


(33) a. 连老郭都进进出出了好几次。
        lián lǎo-Guō dōu jìn-jìn-chū-chū-le        hǎojǐ cì
        even old-Guo all enter-enter-exit-exit-PFV many  time
        ‘Even old Guo entered and exited from there many times.’²⁹
     b. 她试了试那件衣服。
        tā    shì-le  shì nà   jiàn yīfu
        3SG.F try-PFV try that CLF  dress
        ‘She tried on that dress.’


A second piece of evidence comes from ‘rhotacization’ or erhua (儿化 érhuà), a
morpho-phonological phenomenon that is very common in the speech varieties of Northern
China, consisting in the addition of a retroflex approximant (儿 -r) at the end of a word.
More precisely, phonologically, this suffix incorporates into the final syllable of a host
stem, replacing an existing coda, as e.g. 公园 gōngyuán → 公园儿 gōngyuár ‘park’, 鸟
niǎo → 鸟儿 niǎor ‘bird’. The suffix 儿 -r can appear in reduplicated adjectives, and in
the AABB pattern it occurs after the whole reduplicated adjective:


(34) 高高兴兴儿
     gāo-gāo-xìng-xìng-r
     ‘really happy’


29 http://www.cctv.com/program/zoujinkexue/topic/science/C15580/20060413/100489.shtml [2016-11-24].

Lee-Kim (2016) observes that, even if to a lesser extent, this suffix can also be found
in the reduplication of modifier-head adjectives (see 3.1). However, in this case the suffix
attaches after each AB, i.e. AB-r AB-r:


(35) 雪白儿雪白儿
     xuě-bái-r-xuě-bái-r
     ‘(very) snow-white’


According to Lee-Kim (2016), this difference between the AABB pattern and the ABAB
pattern, as far as the suffix 儿 -r is concerned, suggests that these two types of
reduplication have a distinct internal structure. Assuming that 儿 -r adjoins to a phrasal node
that introduces categorial information (n, v, a in DM), since it consistently occurs at
the end of a full-fledged category, Lee-Kim argues that the contrast between (34) and
(35) indicates that each AB forms an adjective phrase in the adjectival ABAB pattern
of reduplication, while AABB as a whole forms a single adjectival phrase. She further
argues that modifier-head compounds undergo erhua before reduplication
([AB-r]-RED), while coordinate compounds reduplicate before 儿 -r adjoins ([AB-RED]-r).
Since in the ABAB pattern 儿 -r adjoins before reduplication, the double occurrence
of this suffix (AB-r AB-r) follows straightforwardly: reduplication applies to the whole
suffixed compound AB-r, copying it as a whole. According to Lee-Kim, this also suggests that
reduplication of modifier-head compounds is phrasal, while reduplication of coordinate
compounds targets units smaller than a phrase. A corollary of this analysis might be that
reduplication applies to units both below and above X°, but under this view it would be
difficult to explain why there are no constraints on the gradability of the base in the case
of ABAB adjectival reduplication.
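The ordering argument can be made concrete with a small sketch. The Python fragment below is our own illustration and not Lee-Kim's formalization: forms are represented as lists of syllables, and the helper names (`erhua`, `copy_whole`, `copy_each`, `render`) are hypothetical. It shows only that suffixing before whole-base copying yields AB-r AB-r, whereas suffixing after constituent copying yields AABB-r; the phonological incorporation of -r into the final syllable is not modelled.

```python
def erhua(form: list[str]) -> list[str]:
    """Attach the suffix -r at the right edge of a form."""
    return form + ["r"]


def copy_whole(form: list[str]) -> list[str]:
    """Copy the form as a unit, as assumed for modifier-head bases (ABAB)."""
    return form + form


def copy_each(a: str, b: str) -> list[str]:
    """Copy each constituent in place, as in the AABB pattern."""
    return [a, a, b, b]


def render(form: list[str]) -> str:
    """Join the pieces with hyphens, following the transcription in (34)-(35)."""
    return "-".join(form)


# Modifier-head base: suffixation first, then copying, i.e. [AB-r]-RED
modifier_head = render(copy_whole(erhua(["xuě", "bái"])))

# Coordinate base: copying first, then suffixation, i.e. [AB-RED]-r
coordinate = render(erhua(copy_each("gāo", "xìng")))

if __name__ == "__main__":
    print(modifier_head)
    print(coordinate)
```

Under these assumptions, the two outputs correspond to the attested shapes in (35) and (34), respectively.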

An alternative and more feasible hypothesis is that the ABAB pattern instantiates
another kind of phenomenon, which is well attested across languages (even those
that lack productive reduplication), viz. contrastive focus reduplication/repetition. Unlike
‘morphological’ reduplication, contrastive repetition phenomena involve the
copying of full-fledged words and sometimes phrases, as in the following examples from
Ghomeshi et al. (2004: 308), and typically show no phonological/tone reanalysis or other
types of morpho-phonological readjustment phenomena that characterize reduplication
in a cross-linguistic perspective:


(36) a. I'll make the tuna salad, and you make the SALAD-salad.
b. My car isn’t MINE-mine; it's my parents’. 
c. Oh, we're not LIVING-TOGETHER-living-together. 


The semantic effect of this construction is, according to Ghomeshi et al., "to focus the 
denotation of the reduplicated element on a more sharply delimited, more specialized, 
range" (p. 308). For example, in (36a) SALAD-salad denotes green salads as opposed to 
salads in general. 

Although the interpretive difference between increasing reduplication and contrastive
repetition is difficult to get from our Mandarin-speaking informants, we suggest that
reduplicated adjectives such as xuě-bái-xuě-bái ‘snow-white~snow-white’
might have a similar semantic effect, which is to express a prototypical, standard
property denotation in the adjectival domain. As such, ABAB would be a different
phenomenon applying at the phrasal level and crucially lacking the morphological
constraints found with increasing reduplication. In contrast, the AABB pattern operates
below the X° level and affects the gradable property of the base, i.e. it turns a gradable
base into one that is no longer gradable (see section 3.1).

A further element which seems to support the status of the AABB reduplicated forms
as syntactically atomic units³⁰ is that they are often formed by at least one bound root
(either A or B, or both of them) which cannot stand as a syntactic word by itself (see
section 3.4, ex. (31) and fn. 26 and 27). For instance, in example (37) the AB base is
formed by two bound roots (cf. the free forms 儿子 érzi ‘son’ and 孙子 sūnzi ‘grandson’):


(37) 子孙                 →  子子孙孙
     zǐ-sūn                  zǐ~zǐ-sūn~sūn
     son-grandson            son~son-grandson~grandson
     ‘children and grandchildren’/‘descendants’    ‘heirs’/‘generation after generation of descendants’


This further corroborates the hypothesis that this process applies to roots, thus to 
acategorial elements; bound roots, indeed, have ‘nouny’, ‘verby’, ‘adjective-like’, etc. fea- 
tures, but, since they are not able to occupy a syntactic slot by themselves, they do not 
have a syntactic category proper. 


4 Analysis 


Given the properties illustrated thus far, in this section we will propose that AABB redu- 
plication is a phenomenon applying at the root level, as we briefly mentioned in section 
3.5. In particular, in the previous sections we have shown that the AABB pattern applies 
across categories and even to non-attested AB units, can be ‘category changing’ (e.g. a 
coordination of two noun-like roots may result in an adjective), can be formed by bound 
roots, and displays syntactic atomicity/lexical integrity. 

We thus propose, along the line of Wiltschko (2008) and Zhang (2015), that AABB 
reduplication constitutes a modification/adjunction process which targets category-less 
roots. 


4.1 Reduplication of (compound) roots 


Over the last two decades, frameworks of word formation, especially Distributed
Morphology or Borer’s exoskeletal framework (2003), have taken very seriously the
hypothesis that roots, as the invariant core of full-fledged words (stripped of all
morphological formatives), are category-less elements, and that they must be combined in
the syntax with category-assigning heads (see among others Marantz 2001, Embick &
Noyer 2007, Embick & Marantz 2008). Under this view, lexemes/words are never atomic
entities, but are the spell-out forms of roots selected by a functional head, i.e. a, n, v,
determining the corresponding phrasal domain, so that: N = [n + √], V = [v + √],
A = [a + √].

30 Whether they are category-less roots/stems or standard lexemes endowed with category features will be
discussed throughout section 4.

Adopting this approach to word formation and its compositional analysis of lexemes,
a possibility allowed by the system is that morphological phenomena traditionally
described as ‘derivational’ do not actually target lexemes proper but category-less items,
i.e. category-less roots. Increasing reduplication in Mandarin would then fall within the
realm of those phenomena that apply at a very ‘low’ level in the morphosyntactic
derivation, namely before categorization takes place. Leaving aside for the moment the
complicating factor that the base of increasing reduplication is not a single root but a compound
form made up of two roots (see section 4.4 for further discussion on this), under this
analysis it naturally follows that the whole reduplicated AABB form can be assigned to
different lexical categories, in accordance with the ontological (/sortal) specification of
the root, i.e. whether it denotes objects, events, or (gradable) qualities/attributes.


(38) a. [nP n [√root RED √root]]
     b. [vP v [√root RED √root]]
     c. [aP a [√root RED √root]]


In (38) we limited our representation to nouns, verbs and adjectives, but the analysis
can in principle be extended to other categories too, such as adverbs. The assumption
that roots are atomic, non-decomposable elements virtually independent of the traditional
lexical categories (i.e. roots are not associated with categorial information such as
noun, verb, adjective; see Marantz 1997) allows for a unified analysis of AABB
reduplication across categories. Under this approach, reduplication involves acategorial
items, and categorization is determined afterwards, in accordance with the type of
category-determining heads, i.e. n, v, a, and under the assumption that “whatever category
can select for roots can also select for pluralized roots, because pluralized roots are
still roots” (see Wiltschko 2008: 60).

While we argue, along the lines of Wiltschko (2008) and Zhang (2015), that a single
structural analysis is capable of accounting for all the category patterns of increasing
reduplication, the interpretive outcomes of reduplication are still in need of a
satisfactory analysis in the literature.

As can be observed in other languages too, reduplication of nouns and verbs is a
(lexical) means of pluralization. The existence of lexical plurals, in particular in the
nominal domain, is well attested across languages, with Italian, for instance, having a
class of (feminine) nouns that are lexically specified as being plural (e.g. braccia ‘arms’,
see Acquaviva 2008). As for the Chinese cases under consideration, according to Zhang
(2015), AABB reduplication expresses overall a ‘greater plural’ meaning, which can apply
both to individual-denoting and to action-denoting elements. In particular, this plural
marker, according to Zhang, is integrated in the word-formation domain, where, instead
of categorial features, semantic features (see Cinque 1990, Lieber 2004, Lieber 2006) and
probably phonological features take part in the selection.

Zhang’s analysis relies heavily on Wiltschko’s (2008) analysis of pluralization in
Halkomelem Salish. Wiltschko proposes, based on different distributional properties, that
in a language like English, with obligatory plural marking, and in a language like
Halkomelem, with optional plural marking, plural markers differ in their ‘way’ and place of
merging. While in English, as is generally assumed, the plural marker spells out the plural
value of a functional head selecting a phrasal node such as little n, in Halkomelem
plural marking functions as a modifier of the category-less root:


(39) a. English
        [D D [#:PL #:PL [n n √root]]]
     b. Halkomelem
        [D D [n n [√root PLURALIZER √root]]]


According to Wiltschko (2008: 688), modifying plural markers (39b) have the syntax
of adjuncts, rather than of selecting heads, because of a set of properties setting them
apart from functional plurals: they are not obligatory; they do not trigger agreement;
their absence is not associated with a specific meaning, but instead is truly unmarked;
they cannot be selected for; they do not allow for form-meaning mismatches.

We argue that the root-adjoined analysis in (39b) can be the correct analysis for the
Mandarin AABB reduplication under examination, where the ‘pluralizer’ is expressed
by means of the reduplicative pattern itself, i.e. by means of independent phonological
copying of both base units.²¹ This accounts for several peculiar features of AABB
reduplication, such as its non-obligatoriness and cross-categoriality, as well as its
compatibility with the plural marker 们 -men, possibly used to emphasize plurality (see
fn. 22), and with nominal classifiers. In particular, as we noted in section 3.3 (30b),
reduplication and pluralization are not incompatible:


(40) 子子孙孙们 (extracted from ex. (30b))
     zǐ~zǐ-sūn~sūn-men
     son~son-grandson~grandson-PL
     ‘heirs/generation after generation of descendants’


Furthermore, the plural meaning of increasing reduplication is not merely ‘plural’: 
since it applies to a coordination of entities/individuals which are per se inherently plural 
(AB means the sum of the entities/individuals denoted by A and those denoted by B, see 
section 3.4), its meaning is that of ‘excessive/greater plural’.

Another striking feature shared by Halkomelem Salish and Mandarin lies in the fact
that their ‘lexical’ plural marking is not restricted to nouns, unlike inflectional
plural marking, which is typically bound to nominal lexemes (not counting agreement
plural marking, which can occur wherever it is required). This leads us to discuss the
other lexical categories of the outputs of these reduplicative processes.

As for the verbal domain, the pluractional meaning of reduplicated verbs is certainly
not exceptional from a cross-linguistic perspective. A great many reduplicative processes
across languages show a pattern close to that of Mandarin, where (increasing)
reduplication in the verbal domain implies repetition/iteration of the event expressed by
the base, hence operating over the verb’s aspectual structure. This means that increasing
reduplication has an inherent quantificational meaning, resulting in a plurality of
individuals or in a pluractionality of events, in compliance with the (vague) root meaning,
ultimately determined by the type of selecting head, n vs. v, taking the reduplication as
its complement (see (38)). Another property in common with nouns and, to the best of our
knowledge, specific to Mandarin Chinese, is the need for a base composed of coordinated
roots (especially in the case of verbs), standing in a symmetrical relation. We will come
back to this intriguing issue in section 4.3.

²¹ The intriguing issue of the peculiar phonological exponence of disyllabic increasing
reduplication is left for future investigation, but we refer to Feng (2003) for an
interesting analysis within an Optimality Theory framework. See section 4.4 for further
remarks on this.


4.2 Zooming in on adjectives 


Whereas the plural analysis seems to fit the nominal and verbal domains of AABB
reduplication nicely, it remains to be understood what the interpretive analysis of
adjective reduplication is. Interestingly, Wiltschko (2008) observes that in Halkomelem
Salish the pluralizer (be it an affix, ablaut or a reduplicated form) occurs productively
not only with nouns (41a, 41b), but with verbs (41c) and adjectives (41d) too (Wiltschko
2008: 641, 679-680), conveying a meaning close to the one we find in Mandarin AABB
reduplication:²²


(41) a. méle                        mámele
        child                       child.PL
        ‘child’                     ‘children’
     b. q’ámi                       q’alemi
        girl                        girl.PL
        ‘girl’                      ‘girls’
     c. qw’óqw-et                   qw’óleqw-et
        whip-TRANS                  whip.PL-TRANS
        ‘whip something/someone’    ‘whip something/someone several times’
     d. kw’és                       kw’ó-kw’es
        hot                         hot.PL
        ‘hot’                       ‘real hot/very hot’


Wiltschko (2008) argues that, no matter whether it occurs in the context of nouns,
verbs or adjectives, the plural marker is exactly the same. She further observes that, if
the plural marker is exactly the same, we expect it to have exactly the same meaning in
each of these contexts. However, to determine what a root pluralizer denotes, we need to
know what a root denotes, i.e. what its sortal type is. Wiltschko thus speculates that roots
do not have a specific denotation (vs. nouns, which denote individuals, verbs, which
denote eventualities, or adjectives, which denote attributes/qualities); they are able to
name “Events, Things, States and Qualities (see Harley 2005), and the pluralizer appears
to simply assert that there are a lot of Events, Things, States, Qualities, depending on the
nature of the √root” (p. 686).

²² The reader should note that the unmarked form, here glossed as a singular form, is in
fact compatible with both singular and plural interpretation; as we have mentioned, the
plural marker is not obligatory in Halkomelem.

While this intuitive explanation could in principle work for nouns and verbs, it is
nonetheless far less accurate in depicting the increased semantics of reduplicated
adjectives. Looking at the semantic effects that reduplication has on Mandarin adjectives,
it does not seem to be the case that it denotes ‘lots of Qualities’. Rather, it seems that
AABB adjectives express ‘increased intensity’, thus affecting the gradable property of the
base, and this seems to be true also for many other languages that exhibit reduplication
with increasing semantics (with Halkomelem pluralized adjectives not counting as an
exception in this domain, see (41d)).²³ Since reduplication affects gradability, providing
a greater/increased degree value expressed by the base root, we might ask what the
interpretive relation is between increasing reduplication in the adjectival domain, on the
one hand, and increasing reduplication in the verbal and nominal domain on the other,
where reduplication is a means of quantification over entities/individuals and events.


4.3 Wellwood's (2014, 2015) analysis of measurement functions across 
categories 


The analysis of adjectives, especially the fact that only gradable adjectives can be redu- 
plicated, sheds light on the core issue of gradability/scalarity in increasing reduplication. 
However, as we mentioned in the previous section, the relation between increasing redu- 
plication in the adjectival domain and increasing reduplication in the verbal and nominal 
domain still remains to be explained. In this section, based on the existing literature, we 
show that concepts of gradability and measurement, rather than being limited to the 
adjectival domain, may be applied uniformly across categories. This will help to sup- 
port our hypothesis on the function of Mandarin increasing reduplication, namely that 
it expresses a unique function, i.e. ‘increased measure’, as will be discussed in the next 
section. 

While according to some authors gradability is a distinctive property of adjectives 
(see e.g. Jackendoff 1977), a great deal of research over the last decades found evidence of 
gradable properties across lexical categories (see e.g. Bolinger 1972, Bresnan 1973, Doet- 
jes 1997, Neeleman et al. 2004, Caudal & Nicolas 2005, Bochnak 2010). As observed by 
Nicolas (2010), gradable expressions are found among: plural count nouns (more dogs), 
but not singular count nouns (*more dog, *less cup); mass nouns, concrete (more water,
less wine) or abstract (more sadness, less playfulness); adjectives (smaller, less sad); verbs 
(to work more/less). 

Wellwood (2015) puts forward a unified account of comparison across categories,
challenging those theories that consider gradable adjectives as elements specifying measure
functions (see above) vs. nouns and verbs, which allegedly do not express such measure
functions. According to this scholar, “which dimensions are possible across domains is a
consequence of what is measured, rather than which expressions measure” (p. 69).
Wellwood (2015: 69) also observes that a noun like coffee introduces individuals that can
be measured, while a verb like run introduces events and an adjective like tall introduces
states; in any case, they all can be measured along certain types of dimensions,
specifically those which respect the ‘part-whole’ relation (e.g. volume and weight for
soup, but not temperature; time and distance for run, but not speed²⁴). She posits a
variable in nominal and verbal domains “that ranges over measure functions, restricted to
just those that are homomorphic to the measured domain” (p. 68). Wellwood (2014, 2015)
argues that comparative sentences in the adjectival, nominal and verbal domain all contain
instances of a single (phonologically overt or covert) morpheme that compositionally
introduces degrees; “this morpheme, sometimes pronounced much, contributes a
structure-preserving map from entities, events, or states, to their measures along some
dimension” (Wellwood 2015: 67).

²³ According to Xu (2012a), reduplication is iconically motivated, and ‘positive degree’
constitutes its core meaning.

This approach characterizes the notion of “measurement” uniformly in terms of
structure-preservation across comparative constructions and unifies the contrasts existing
(within each category) between gradable and non-gradable adjectives, between mass
and count nouns, and between atelic and telic verb phrases.²⁵ Wellwood observes that
mass nouns tend to show cumulative reference: “if coffee applies to two portions of
matter, then it also applies to the mereological sum of those portions” (p. 71). In contrast,
count nouns, when interpreted singularly, tend to show non-cumulative reference: “if
a cup applies to a given object, it fails to apply to any of its (relevant) proper parts” (p.
71). Therefore, the semantics of mass nouns is modelled in terms of a domain structured
by the part-of relation, while that of a noun like cup lacks such structure. Similarly,
atelic predicates (like mass nouns) tend to show cumulative reference, while telic
predicates tend to show quantized, non-cumulative reference. If run in the park applies to
two stretches of activity, it also applies to their sum; thus atelic events have domains
structured by the part-of relation on events. In contrast, if run to the park applies to an
event, it fails to apply to any of its relevant subparts; thus telic events lack the part-of
relation (Wellwood 2015: 73).
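
The contrast between cumulative and quantized reference described above can be stated compactly. The following is a sketch in standard mereological notation, not a formula taken from Wellwood: the predicate variable P, the sum operator ⊕ and the proper-part relation ⊏ are notational assumptions introduced here for illustration.

```latex
% Cumulative reference (mass nouns such as coffee, atelic predicates such as run in the park):
% whenever P holds of two things, it also holds of their sum.
\mathrm{CUM}(P) \iff \forall x\,\forall y\,\bigl[\,P(x) \wedge P(y) \rightarrow P(x \oplus y)\,\bigr]

% Quantized reference (singular count nouns such as cup, telic predicates such as run to the park):
% whenever P holds of something, it holds of none of its proper parts.
\mathrm{QUA}(P) \iff \forall x\,\forall y\,\bigl[\,P(x) \wedge y \sqsubset x \rightarrow \neg P(y)\,\bigr]
```

On this formulation, coffee and run in the park would satisfy CUM, whereas cup and run to the park would satisfy QUA.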

As for adjectives, Wellwood proposes that non-gradable adjectives, which express 
quantities that either exist or not (a table is either square or not, it cannot be more or less 
square) are formally parallel to (singular) count nouns and telic predicates, while grad- 
able adjectives, which express quantities that there may be more or less of (a thing can 
be more or less hot), are parallel to mass nouns and atelic predicates. They both express 
predicates of states, the difference being that gradable adjectives, unlike non-gradable 
ones, predicate of ordered states: they associate directly with sets of ordered degrees, or 
scales. Besides, Wellwood assumes that the measure functions introduced with gradable 
adjectives are not only homomorphic to the ordering relations on the measured domain, 
but to non-trivial part-whole relations. 


²⁴ For example, she observes that larger portions of soup have greater measures by volume
or weight than smaller portions, but generally this is not the case with measures by
temperature.

²⁵ Gradability presupposes the existence of a scale, and can be seen as related to
±boundedness (see Paradis 2001, Alexiadou 2010).
Therefore, instead of adopting a notion of ‘measurement’ based on a variety of measure
functions acting on the same objects in unpredictable ways, Wellwood proposes
that language encodes measurement of different sorts of things in limited ways.
Accordingly, she elaborates a uniform account of measurement as a monotonic mapping from
ordered sets of entities, events, or states to degrees.
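
The monotonicity requirement can be sketched as follows. This is one common way of writing the condition, given here as an illustration rather than as Wellwood's own formula: the symbols μ (a measure function), D (the measured domain), ⊏ (the proper-part ordering on D) and < (the ordering on degrees) are assumed names.

```latex
% A measure function \mu from a domain D (entities, events or states) to degrees
% is monotonic if it preserves the part-whole ordering on D:
\forall x, y \in D:\quad x \sqsubset y \;\rightarrow\; \mu(x) < \mu(y)
```

Volume is monotonic in this sense for portions of soup, since a proper part of a portion has a smaller volume than the whole, while temperature is not, since a part of a portion need not be cooler than the whole.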


4.4 Reduplication as increased measure 


Let us now try to combine the structural analysis of increasing reduplication proposed
in section 4.2 with the cross-categorial (strictly compositional) analysis of measurement
functions proposed by Wellwood (2014, 2015). In keeping with Wellwood’s proposal that
there are no differences in the type of measurement functions among the lexical
categories at a higher level of syntactic/semantic composition, we speculate that
reduplication conveys a similarly stable/unique function but targets elements lacking any
specification in terms of formal features.²⁶ In particular, we wish to argue that
reduplication expresses a unique function, i.e. ‘increased measure’, that constantly
applies to roots, which differ only in their ontological denotation. Therefore, increasing
reduplication is a very low-level (morphological) adjunction operation which conveys the
function ‘increased measure’ to the roots it applies to: the semantic effects obtained
(pluralization, pluractionality, intensification of the base gradable property) ultimately
depend on the different sorts of things reduplication modifies, and arguably emerge
constructionally, that is, after root categorization applies. It should be noted that,
semantically, similar results might be obtained at a higher level of syntactic composition
via different means, depending on the categorial domain of application, i.e. through
fully-fledged degree phrases in the adjectival domain (see En. ‘very Adj’, e.g. very good;
Ch. ‘很 hěn Adj’, e.g. 很高兴 hěn gāoxìng ‘very happy’), and through the use of plural
affixes and aspectual markers in the nominal and verbal domain respectively.

This analysis, however, does not account for some relevant asymmetries across lexical
categories previously noted in the literature (see Zhang 2015). As argued in section 3,
the main difference at the structural level between adjectives, on the one hand, and nouns
and verbs, on the other, concerns the obligatoriness of disyllabic bases for the latter.
That is, whereas increasing reduplication applies to quality-denoting roots that may be
either mono- or disyllabic, resulting in interpretively equivalent AA and AABB patterns,
with entity- and event-denoting roots it targets disyllabic units, resulting exclusively
in the AABB pattern.²⁷

As we have seen in section 3.3, the AABB reduplication pattern requires a coordinate
base, i.e. two elements related in a symmetrical fashion, either in a logical coordination,
or as synonyms or antonyms; thus, instead of having a single root we have a combination
of roots. These roots are joined together to form a set, whereby the two constituents
contribute equally to the semantics of the whole complex stem, i.e. they are in a
symmetrical relation. Structurally, it is worth emphasizing that these operations all
apply at the root level, resulting in a recursive application of ‘morphological’
phenomena, with (symmetrical) compounding and reduplication rigidly ordered in the
derivation, yet both applying before categorization (see Zhang 2015):

²⁶ It is worth recalling that roots have a strongly underspecified semantics which allows
them to be compatible with the semantics of adjectives (as properties of attributes), verbs
(as properties of events), nouns (as properties of individuals).

²⁷ The generalization holds under the assumption that AA monosyllabic reduplication in the
nominal domain should rather be understood as reduplication of classifiers (see section
3.3). We do not have an analysis of this type of reduplication yet, and we leave the issue
for future research.


(42) [n/v/aP n/v/a [√root RED [√root √rootA √rootB]]]

     n/v/a + √root      ←  [AABB]n/v/a   CATEGORIZATION
     RED + √root        ←  [AABB]        REDUPLICATION
     √rootA + √rootB    ←  [AB]          COMPOUNDING


This analysis seems to produce the surface pattern ABAB, since reduplication applies 
to a compound base AB. However, prosodic patterns within AABB structures actually 
seem to support the structural analysis in (42). In particular, Feng (2003) examines tone 
sandhi rules within disyllabic reduplication and, for AABB, he argues that these rules ap- 
ply first between the second A and first B and then between the first B and second B. On 
this basis, Feng argues that AB is the actual morphological unit, whereas AA and BB are 
not, resulting in the structural analysis [A[AB]B] (Feng 2003: 7-8). The issue deserves fur- 
ther investigation especially aimed at explaining the reason for the mismatch between 
underlying structure, supra-segmental patterns and surface order of morphemes, for 
which at the moment we cannot offer an explanation. Suffice it to say that the prosodic 
pattern of AABB provides evidence in favour of the analysis in (42). 
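
The ordering of operations in (42) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the authors' formalization: the class names (`Root`, `Compound`, `Reduplicated`, `Word`), the function `derive`, and the use of toneless romanized strings as stand-ins for phonological forms are all hypothetical. It shows compounding, reduplication and categorization applying in that order, and contrasts the attested AABB surface order with the ABAB order that copying the whole base would give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Root:
    """A category-less root; `form` is a romanized stand-in for its shape."""
    form: str


@dataclass(frozen=True)
class Compound:
    """Symmetrical compound [AB] of two roots, still category-less."""
    a: Root
    b: Root


@dataclass(frozen=True)
class Reduplicated:
    """RED adjoined to a compound base; no category has been assigned yet."""
    base: Compound

    def surface(self) -> str:
        # Each base unit is copied independently, which yields AABB.
        a, b = self.base.a.form, self.base.b.form
        return "-".join([a, a, b, b])

    def whole_base_copy(self) -> str:
        # Copying the compound as a whole would yield ABAB instead.
        a, b = self.base.a.form, self.base.b.form
        return "-".join([a, b, a, b])


@dataclass(frozen=True)
class Word:
    """Result of categorization: a head n, v or a selects the reduplicated root."""
    category: str
    stem: Reduplicated


def derive(a: str, b: str, category: str) -> Word:
    """Apply compounding, then reduplication, then categorization."""
    if category not in {"n", "v", "a"}:
        raise ValueError("category must be one of 'n', 'v', 'a'")
    compound = Compound(Root(a), Root(b))   # step 1: compounding
    reduplicated = Reduplicated(compound)   # step 2: reduplication
    return Word(category, reduplicated)     # step 3: categorization


word = derive("gao", "xing", "a")
print(word.category, word.stem.surface())   # a gao-gao-xing-xing
print(word.stem.whole_base_copy())          # gao-xing-gao-xing
```

The category is supplied only at the last step, so the same reduplicated stem could in principle be passed to any of the three categorizing heads, in line with the cross-categorial behaviour discussed in the text.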

At the interpretive level, we put forward that the combination of two roots which act
as the base for the AABB reduplication process itself forms a sort of ‘plural/collective’
expression, and reduplication provides an increased measure for this kind of expression.
It has been noted that AABB nouns express greater plural (possibly differing in their
semantics from AA reduplication of nouns/classifiers, most typically expressing a
distributive meaning), and a similar effect is obtained with AABB verbs (the examples in
(43a) and (43b) are adapted from examples (22, 24) in Zhang 2015):


(43) a. 枝枝叶叶²⁸
        zhī-zhī-yè-yè
        twig-twig-leaf-leaf
        ‘twigs and leaves’
     b. 缝缝补补
        féng-féng-bǔ-bǔ
        sew-sew-repair-repair
        ‘sew and repair repeatedly’
A possible explanation for this structural requirement might lie in the different
ontological type of roots: in particular, individual- and event-denoting roots, unlike
quality-denoting roots, seem to require an inherently plural interpretation in order to be
measured. As a matter of fact, comparative expressions with more in English typically
require either mass nouns or plural nouns, but exclude singular nouns (more dogs vs.
*more dog). Similar effects obtain in the domain of verbs with the contrasts between telic
and atelic verbs discussed by Wellwood (2015).

Although at this point the present analysis becomes very speculative, we put forward
here that a principled reason for the necessary disyllabicity of nominal and verbal bases
might have the same source as the asymmetry observed in the domain of comparative
expressions. Specifically, if the semantics of roots is very vague and compatible with any
interpretation which eventually emerges at higher levels of syntactic composition, a way
to introduce gradability at the level of roots is to merge them directly, so as to create a
collection of individuals, like e.g. 男女 nán-nǚ ‘man and woman’ (which is reduplicated as
男男女女 nán-nán-nǚ-nǚ ‘men and women’), or of events, e.g. 起伏 qǐ-fú ‘rise and fall’
(which is reduplicated as 起起伏伏 qǐ-qǐ-fú-fú ‘rise and fall repeatedly’). In this view,
the first merger provides reduplication with the ‘gradable base’ over which it can apply
its increased measure function. On the contrary, roots that are selected by an adjectival
head (i.e. a) would inherently express a gradable property and, accordingly,
reduplication would not pose specific disyllabic requirements on these base units.
Furthermore, if this is the case, we expect no difference in meaning between AA and
AABB reduplicated adjectival forms, as confirmed by the data (see examples (6a) and (6b)
in section 2.1, repeated below for the reader’s convenience):


(44) a. 小 (A)         →   小小 (AA)
        xiǎo                xiǎo-xiǎo
        small               small-small
        ‘small’             ‘very/really small’
     b. 高兴 (AB)      →   高高兴兴 (AABB)
        gāoxìng             gāo-gāo-xìng-xìng
        ‘happy’             ‘very/really happy’


5 Conclusion 


Reduplication is a challenging phenomenon in many respects: it is hardly amenable to
a uniform characterization in a cross-linguistic perspective, given the extreme variety
of forms and functions it is associated with; further, it can surface with different forms
and meanings within a single language too, as we have shown with the reduplicative
processes of Mandarin under consideration; it can manifest semantic functions closely
related to the inflectional/functional domain, but it approaches more closely the domain
of derivation/word formation; finally, it can take as its base units elements of different
size, ranging from lexeme/word-like units in one domain (diminishing reduplication,
which implies verbal reduplication in Mandarin) to category-less units in the other
(increasing reduplication).

The case of diminishing reduplication seems to involve units as ‘big’ as lexemes, i.e.
stems endowed with category features and with specific (aspectual) semantics, as we
have shown in section 2.1. The case of increasing reduplication, however, points to the
existence of word formation phenomena that apply below the lexeme level. In particular,
increasing reduplication appears to be a phenomenon that can apply at a very ‘low level’,
namely, one that can merge with roots/stems lacking category specification. Further, it is
per se unable to express a definite category, given its presence across all major lexical
categories at both input and output levels. Therefore, the present case study sheds some
light on the existence of word formation that neither takes lexemic inputs nor gives
lexemic outputs.

On the one hand, this study brings further evidence in favor of a neo-constructionist/ 
DM-like view of the lexemes or word units as syntactically complex elements, and ul- 
timately for the very existence of category-less roots. On the other hand, the curious 
asymmetries observed in the domain of increasing AA and AABB reduplication, whereby 
adjectives seem to part company from verbs and nouns, call into question the semantic 
(ontological?) character of roots and their alleged requirements for insertion in the syn- 
tactic structure responsible for category assignment and, overall, for their morphosyn- 
tactic properties and distribution. This is a very complex issue on which we hope to have 
contributed some further empirical and theoretical basis but that, it goes without saying, 
needs further research and ampler empirical coverage to be satisfactorily addressed. 

To conclude, our research has explored the structural and interpretive effects of
reduplication, which is highly productive in Mandarin (see Basciano & Melloni 2017) and
broadly attested across Sinitic (see Arcodia et al. 2015), yet still lacks a satisfying
analysis, despite growing interest in recent years. In so doing, we hope to have paved the
way for a better understanding of Mandarin reduplication specifically, and more generally
for an approach to word formation which seeks to reinterpret morphology-specific
properties and restrictions within a more integrated model of grammar, where syntax is
also responsible for word formation.
Parasynthesis across models:
From RCLs to ParaDis
This article is devoted to the analysis of so-called parasynthetic forms, to the way this
analysis has evolved along with the theoretical models that have addressed it, and to the
way in which, in return, it has contributed to changing them. The evolution of the analysis
of parasynthetic derivation can indeed be seen as an indicator of the transformations and
advances of morphological theories and derivational models. In particular, we show
how the successive proposals for analysing this phenomenon have led to a gradual
loosening of theoretical frameworks, starting from morphemic models, in which form and
meaning are fully bound together within morphemes, moving through lexemes and
Lexeme Construction Rules (Règles de Construction de Lexèmes, RCL), which bring about a
first separation between the three dimensions of the lexeme (form, category and meaning),
and arriving at paradigmatic models of derivational morphology, in which the binary
relation between base and derivative is generalized to networks of lexemes connected to
networks of properties. This progression leads us, finally, to our ultimate goal: the
presentation of the constructional model of analysis ParaDis, whose genesis is the outcome
of the successive or parallel theoretical transformations that have shaped the various
currents in derivational morphology. The analytical principles of ParaDis combine the
formal principles underlying RCLs and the three-dimensional structure of lexemes with a
network approach to lexical construction. Through the example of prefixation in anti-, we
show how this original combination makes ParaDis a framework that has the properties and
keys needed to analyse parasynthetic constructions simply and intuitively.


1 Introduction 


Derivational morphology, far more than inflectional morphology, contains a large
number of constructions that are difficult to describe because of the number and
diversity of the variations observed. In inflectional morphology there are models,
such as that of Stump (2016), capable of describing the entire system for most
European languages. A large part of the research effort in this area bears on optimizing
these systems with respect to their computational complexity, the notions they make use
of, or their psychological plausibility. The situation is very different in derivational
morphology, where a large number of phenomena have still not received a complete and
satisfactory analysis. This is the case for the very many non-canonical formations, in the
sense of Corbett (2010), of which parasynthetic constructions are a well-known example.
These constructions are both a recurrent and a long-standing object of study in
morphology. In particular, this phenomenon, which has interested French researchers since
Darmesteter (1877, 1894), was extensively treated within the generative models of the
1970s (Dell 1970, 1979), and then by the major specialists of morphology in France,
notably Corbin (1980, 1987) and Fradin (1997a, 1997b, 2003).

In this article we are concerned with the analysis of these forms (section 2), with the
way it has evolved along with the theoretical models that have addressed it, and with the
way in which, in return, it has contributed to changing them. The evolution of the analysis
of parasynthetic derivation can indeed be seen as an indicator of the transformations and
advances of morphological theories and derivational models. In particular, we show
how the successive proposals for analysing this phenomenon have led to a gradual
loosening of theoretical frameworks, starting from morphemic models (section 3), in which
form and meaning are fully bound together within morphemes, moving through lexemes and
Lexeme Construction Rules (Règles de Construction de Lexèmes, RCL; section 4), which
bring about a first separation between the three dimensions of the lexeme (form, category
and meaning), and arriving at paradigmatic models of derivational morphology (section 5),
in which the binary relation between base and derivative is generalized to networks of
lexemes connected to networks of properties.

This progression leads us, in section 6, to our final goal: the presentation of the constructional analysis model ParaDis, which grew out of the successive or parallel theoretical transformations that have shaped the various currents of derivational morphology. ParaDis inherits in particular from two approaches whose contribution has been decisive for the treatment of parasynthetic derivatives specifically and, more generally, of constructions that depart from the principles of derivational canonicity. These are, on the one hand, the work presented in Fradin (2003) and, on the other hand, the analyses developed in Toulouse in reaction to these theoretical proposals.

ParaDis is developed as an articulation of the Toulouse approach with the dissociation of the formal, categorial and semantic levels made possible by the formalization of the lexeme and of RCL developed by Fradin (2003). The foundation of the ParaDis model is extended to the cumulative patterns of Bochner (1993) and makes derivational morphological relations one of its fundamental units. Its principles of analysis thus combine the solutions of the Toulouse approach, the formal principles underlying RCL and the three-dimensional structure of lexemes with a network approach to lexical construction. Through the example of prefixation in anti-, we show how this original combination makes ParaDis a framework that has the properties and the keys needed to analyze parasynthetic constructions in a simple and intuitive way.


2 So-called parasynthetic constructions


The term "parasynthetic" derivation, introduced by Darmesteter (1875, 1877), is used to describe derived structures (i) that instantiate the pattern pref-X-suf and (ii) that display a mismatch between their meaning and their form. In French, parasynthetic derivatives are mainly adjectival (e.g. grève_N → antigréviste_A) or verbal (e.g. sensible_A → désensibiliser_V, rat_N → dératiser_V), although some studies also report nominal parasynthetics (e.g. col_N → encolure_N)¹.

Beyond French, parasynthetic derivations are very widely observed in the Romance languages (Reinheimer-Ripeanu 1974, Serrano Dolader 2015): in Portuguese (1a) (Basilio 1991); in Italian (1b) (Guevara 2007, Iacobini 2004, Melloni & Bisetto 2010, Scalise 1994); in Spanish (1c) (Serrano Dolader 1995, Schroten 1997); but also in Greek (1d) (Efthymiou 2014), and in Slavic languages such as Slovak (1e) or Germanic languages such as German (1f), where certain types of compounds called "synthetic" or "exocentric" display an analogous configuration (see among others Neef (2015), Gaeta (2010), Crocco-Galéas (2003), Chovanová (2010) and, for a full overview, Lieber & Štekauer (2009)).


(1) a. a-X-cer
       apodrecer ‘to rot’, where X = podre ‘rotten’
    b. extra-X-ale
       extramatrimoniale ‘extramarital’, where X = matrimonio ‘marriage’
    c. sub-X-ino
       submarino ‘submarine’, where X = mar ‘sea’
    d. apo-X-izo
       apokefalizo ‘to behead’, where X = kefale ‘head’
    e. Y-X-y
       bosonohy ‘barefoot’, where Y = bosy ‘bare’ and X = noha ‘foot’
    f. Y-X-ig
       blauäugig ‘blue-eyed’, where Y = blau ‘blue’ and X = Auge ‘eye’


A property shared by these constructions is the variability of the values that the sequence X-suf can take for a constant pref-. The examples in (2) illustrate, for French, the construction of adversative adjectives that all contain the prefix anti-. The suffixal sequence (-al, -ique, -aire, -eux) varies, without this variation having any impact on the interpretation of the adjective. For a given X, this sequence is isomorphic to the exponent of the rule forming the denominal adjective X-suf (e.g. gouvernemental ‘relating to the government’).

¹ Another type of construction was long considered to belong to this class of derivatives: verbs formed by prefixation, such as dépoussiérer in French or its equivalent spolverare in Italian. For its proponents, this analysis rests on two justifications: (i) prefixation is claimed to lack category-changing power, and (ii) the suffixal inflectional marking that systematically appears on verbs in the Romance languages has a derivational power whose explanation involves diachronic factors (Crocco-Galéas & Iacobini 1993, Iacobini 2010, Acedo-Matellán & Mateu 2009).


(2) anti-X-al (antigouvernemental, where X = gouvernement),
    anti-X-ique (antialcoolique, where X = alcool),
    anti-X-aire (antiparlementaire, where X = parlement),
    anti-X-eux (anticancéreux, where X = cancer).
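
The division of labour between form and meaning in (2) can be made concrete with a small sketch. It assumes a hand-built toy table (`DENOMINAL_ADJ`, a hypothetical name, limited to the four nouns in (2)) pairing each noun X with its relational adjective; the function `anti_adjective` is likewise only illustrative. The form of the anti- derivative is read off the adjective, while its gloss is read off the noun alone.

```python
# Illustrative sketch only: a toy lexicon limited to the examples in (2).
# DENOMINAL_ADJ and anti_adjective are hypothetical names, not part of any cited model.
DENOMINAL_ADJ = {
    "gouvernement": "gouvernemental",
    "alcool": "alcoolique",
    "parlement": "parlementaire",
    "cancer": "cancéreux",
}


def anti_adjective(noun: str) -> dict:
    """Build the anti- adjective related to `noun`.

    Form: anti- is added to the form of the relational adjective X-suf.
    Meaning: computed from the noun X alone; the suffix contributes nothing.
    """
    adjective = DENOMINAL_ADJ[noun]
    return {
        "form": "anti" + adjective,
        "formal_base": adjective,    # supplies the suffixal sequence
        "semantic_base": noun,       # supplies the meaning
        "gloss": f"opposed to what '{noun}' denotes",
    }


for x in DENOMINAL_ADJ:
    print(anti_adjective(x))
```

Whatever suffix the table supplies, the gloss keeps the same shape, which is the variability at constant pref- described above.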


With respect to the canonicity criteria set out by Corbett (2010), parasynthetic derivatives clearly depart from the ideal situation, which is characterized by the joint observation of two properties in a derivative: formal transparency and semantic compositionality. In parasynthetic formations, the two possible formal segmentations, pref- + X-suf and pref-X + -suf, are both incompatible with the semantic decomposition: the derivative contains a formal mark (i.e. -suf) that does not correlate with any element contributing to the construction of meaning. In other words, we are dealing here with a case of what Hathout & Namer (2014b) call "formal overmarking" (surmarquage formel), expressed by a suffixal phonological sequence whose form is variable.

To resolve this mismatch, morphological models develop three types of complementary strategies:


1. they give priority to the interpretation of the derivative and amend the theoretical principles to account for its form;


2. they favor formal regularity at the expense of the construction of meaning, for which they make special adjustments;


3. they act on both levels so that the representations correspond to each other.


We will see (section 3) that morpheme-based models, whether they belong to the Item and Arrangement framework or adopt a more functional conception of the affixal morpheme (Corbin 1987), choose the first option. We then show, in section 4, how Fradin (2003), whose model belongs to the lexeme-based current of morphology, opts for the second solution. Finally, we explain in section 6 how the ParaDis system, conceived as a synthesis of the principles presented in section 4 and of the Toulouse proposals (section 5), seeks to follow the third of the strategies listed above.


3 Parasynthesis and morpheme-based morphology


The principles at work in the traditional morpheme-based current of derivational morphology conceive of the construction of a word as the result of successive binary concatenations, in accordance with the Binary Branching Hypothesis inherited from structuralism and adopted in morphology by Aronoff (1976) and Booij (1977), among others². One of the two constituents brought together in a rule is an affixal morpheme, a minimal unit of form and meaning, which constrains the phonological, categorial and semantic properties of the other constituent, i.e. the base with which it combines. Since these constraints simultaneously affect its semantic and formal dimensions, two difficulties arise for the analysis of parasynthetics. Both illustrate the mismatch between form and meaning, but each corresponds to one of the realities covered by the notion of parasynthesis:


1. The base of the parasynthetic derivative is unattested. For the verb dératiser, for example, neither of the two bases that would be obtained by removing the prefix or the suffix is attested. This verb is neither prefixed on ratiser nor suffixed on dérat.


2. The meaning of the derivative is non-compositional, as illustrated by the examples antigréviste_A and désensibiliser_V. At first sight, the verb désensibiliser could have sensibiliser_V as its base, but the meaning of désensibiliser is not a function of that of sensibiliser: "désensibiliser une dent" (to desensitize a tooth) does not mean ‘to cancel the result obtained by the act of sensitizing the tooth’, but ‘to deprive a tooth of its sensitive character’. Likewise, the meaning of antigréviste ‘opposed to the strike’ involves neither that of gréviste (which would imply anti- prefixation applied to a suffixed base) nor that of antigrève (which would correspond to the -iste suffixation of a word prefixed with anti-).


Citing Serrano Dolader (1995: 23-74), Iacobini (2004: 167) summarizes in three analytical schemas the solutions that proponents of the morpheme-based framework adopt for parasynthetic derivatives. Besides the solution that preserves the binarity of derivation rules and applies suffixation and then prefixation, or vice versa, in succession, the parasynthetic approach advocates either the simultaneous concatenation of the prefix and the suffix to the base morpheme, or the assignment of circumfix status to the sequence formed by the prefix and the suffix; the third approach, defended in Corbin (1980, 1987), involves assigning category-changing power to the prefixation process, which is rejected for instance by Scalise (1984) or Alcoba-Rueda (1987). We illustrate these types of analysis below with the examples of the verbs dératiser ("dératiser Y" means ‘to remove the rats from Y’) and désensibiliser ("désensibiliser Y" means ‘to deprive Y of its sensitive character’).


3.1 Sequential application of dé- and -iser


To preserve the homocategorial and binary nature of prefix-base combination rules, the analysis of parasynthetic derivatives defended for instance by Alcoba-Rueda (1987) sees dératiser and désensibiliser as the result of the concatenation of the prefix dé-, applied to the noun rat and to the adjective sensible respectively, followed by that of the suffix -iser, which selects the unattested nominal base °dérat- or adjectival base °désensibl- obtained at the end of the first step. The analysis of dératiser, rendered in a bracketed notation that encodes its derivational history, is given in (3), and that of désensibiliser in (4). In Scalise (1984), the reasoning is the same, except for the order in which the affix-base combination rules apply. For this author, the last step in the construction of désensibiliser (resp. dératiser) is the concatenation of the prefix dé- to a base suffixed with -iser (sensibiliser), which may be unattested (ratiser). These derivations are represented in (5) for dératiser and in (6) for désensibiliser.

² See also Guevara (2007) for a theoretical justification and Heyna (2014) for a full overview of the treatments proposed for French within this theoretical framework, as well as for a proposed analysis of adjectival parasynthetic derivatives in anti- and verbal ones in dé-.


(3) [[ dé- [rat]_N ]_N -iser ]_V
(4) [[ dé- [sensible]_A ]_A -iser ]_V
(5) [ dé- [[rat]_N -iser ]_V ]_V
(6) [ dé- [[sensible]_A -iser ]_V ]_V
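
The two orderings in (3) and (5) can be compared with a small sketch. It treats a derivational history as an ordered list of binary affixation steps on orthographic strings; the names `derive`, `unattested` and the toy lexicon `ATTESTED` are hypothetical and only serve the illustration. Stem allomorphy (sensible / sensibil-) is not modelled, so only dératiser is shown.

```python
# Illustrative sketch: derivational histories for dératiser as ordered affixation steps.
# ATTESTED is a hypothetical toy lexicon containing only the forms assumed to exist.
ATTESTED = {"rat", "dératiser"}


def derive(base: str, steps: list[tuple[str, str]]) -> list[str]:
    """Apply binary affixation steps in order and return every intermediate form."""
    forms = [base]
    for kind, affix in steps:
        current = forms[-1]
        if kind == "prefix":
            forms.append(affix + current)
        elif kind == "suffix":
            forms.append(current + affix)
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return forms


def unattested(forms: list[str]) -> list[str]:
    """Return the forms of a derivation that are missing from the toy lexicon."""
    return [f for f in forms if f not in ATTESTED]


prefix_first = derive("rat", [("prefix", "dé"), ("suffix", "iser")])  # analysis (3)
suffix_first = derive("rat", [("suffix", "iser"), ("prefix", "dé")])  # analysis (5)

print(prefix_first, unattested(prefix_first))
print(suffix_first, unattested(suffix_first))
```

Both orders reach the same surface form, and each passes through a different unattested intermediate base, which is the difficulty shared by the two sequential analyses.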


3.2 Simultaneous application of dé- and -iser


Another solution proposed to explain the construction of dératiser and désensibiliser is the simultaneous attachment of dé- and -iser to the noun rat or the adjective sensible. The schematic representations of the analysis of these two verbs are (7) and (8), respectively.


(7) [ dé- [rat]_N -iser ]_V
(8) [ dé- [sensible]_A -iser ]_V


This is what Booij (2002) calls "synaffixation", and it amounts to accepting the existence of ternary rules: two operators apply simultaneously to the same base. The drawback of this solution is that each of the two affixes normally selects a nominal or adjectival constituent on its own: dé- selects herbe to form désherber and -iser selects cristal for cristalliser; dé- combines with saoul in dessaouler and -iser is concatenated to fertile in fertiliser. The simultaneous application of the two morphemes to the same base goes hand in hand with a combined contribution of their semantic content: privation for dé- and cause for -iser. But this analysis contradicts the principle of semantic uniqueness of morphemes (the meaning of dé- in désherber, for example, combines the privation and cause operators) as well as the principle of morpheme combination, according to which a rewriting rule is binary and associates only one affixal head with the constituent governed by that affix.


3.3 Circumfixation


Authors such as Bosque (1983) propose to analyze the affixal sequences as "discontinuous morphemes" or "circumfixes"; this approach amounts to treating dé-...-iser as a single affix whose combination with rat or sensible respects the binarity principle of the model's rewriting rules. The result is a one-step construction: (9) represents the analysis of dératiser, and (10) that of désensibiliser. However, the use of a circumfix raises several problems: (i) it does not explain how the value of the suffixal sequence is chosen (for example, why do we find -iser in dératiser, but -ifier in dégazéifier?); (ii) it violates the principle of morpheme uniqueness: dé-...-iser, dé-...-ifier and dé- are synonymous morphemes, but the allomorphic variation that distinguishes them cannot be attributed to morphophonological constraints.


(9) [ dé-...-iser [rat]_N ]_V
(10) [ dé-...-iser [sensible]_A ]_V


3.4 Prefixation and paradigmatic integrator


The analysis of parasynthetics as prefixed derivatives in which the suffix is a paradigmatic integrator also belongs to the binary tradition of derivation rules, but it rejects the claim that the prefix lacks category-changing power. It was proposed by Danielle Corbin, whose work, over the last 30 years of the 20th century, drove fundamental theoretical developments in derivational morphology that go beyond the study of the French lexicon to which the author devoted herself. With her thesis (Corbin 1987), Danielle Corbin develops a generative system for representing the constructed lexicon that moves away from the principle of morpheme concatenation. This system includes a derivational component that uses associative Word Construction Rules (Règles de Construction de Mots, RCM); an RCM is a morphological process that applies to a base. The operating principles of RCMs open up new perspectives for the analysis of parasynthetic derivatives. Already in Corbin (1980), and again in Corbin (1987), the author defends the idea that a prefix is able to produce derivatives whose grammatical category differs from that of the base.

In this analysis, the morphological construction of dératiser, given in (11), like that of désensibiliser, in (12), is carried out in two steps: prefixation with dé- on a nominal or adjectival base, followed by a formal modification affecting the resulting verbal sequences dérat- and désensibl- respectively, which consists in adding, at the output of the derivational component, the meaningless segment -is(er).


(11) [ dé- [rat]_N +is(er) ]_V
(12) [ dé- [sensible]_A +is(er) ]_V


The suffixal sequence, identified in the notation above by the sign "+", is called a "paradigmatic integrator" because its role is to insert the word to which it applies into a paradigm, here the class of change-of-state verbs. The Copy Principle, which governs the use of this integrator, gives the added segment a purely analogical function: -is(er) is the most available verbal suffix and corresponds, where applicable, to the value of the verbal suffix used in the family of the prefixed word (e.g. sensibiliser). Appeal to this principle is not sufficient, however, to explain the absence of copying for certain verbs prefixed with dé-, such as désherber.


3.5 Summary


These proposed analyses are all motivated by the desire to account for the direct semantic link between rat or sensible and the related prefixed verb. Each in its own way, they seek to represent the sequence -iser in a manner that bypasses the semantic mismatch: either +is(er) is emptied of its meaning and is no more than a category marker, or the two affixes share the semantic and categorial properties, or else they merge into a single morpheme.


4 Parasynthesis and RCL


Parasynthetic derivatives are among the derived structures that benefit most substantially from the lexeme-based approach to morphology, and more particularly from the innovations of the model of Lexeme Construction Rules (Règles de Construction de Lexèmes, RCL) as it is developed, motivated, detailed, formalized and extensively illustrated in Fradin (2003). In that book, the only unit handled is the lexeme, an object for which the author develops his own definition, following, among others, the work of Anderson (1992), Aronoff (1976), Beard (1995), Carstairs-McCarthy (1992), Matthews (1974) and Scalise (1984), each of whom proposes an alternative to the Item and Arrangement approach (Hockett 1954) to which morpheme-based models belong. The conception of the lexeme is defended through a series of examples that section 4.1 below briefly summarizes³.


4.1 Lexeme and RCL: fundamental principles


Fradin (2003) builds his model within the lexeme-based framework of constructional morphology. Its originality shows in the following properties:


— A lexeme is a three-dimensional unit whose fields, independent of one another, record its formal, syntactic-categorial (or "syntactic" in the terminology of Fradin (2003), inspired by that of Mel'čuk (1993)) and semantic properties.


— The lexicon is hierarchically organized, so that, for every lexeme, each dimension inherits, through type sharing, the properties of the element that dominates it in the hierarchy. This organization is inspired by the work of Koenig (1999) and Davis & Koenig (2000).


— Contrary to what lexical hierarchization implies in Koenig (1999), the lexeme is a monosemous entity whose semantic content is fully specified. The argument, taken up in Fradin & Kerleroux (2003a,b), is that constructional processes apply to semantically unambiguous lexical entries but form constructed lexemes whose semantic content may be underspecified, the specification of this content depending on other factors.


— An RCL relates a base lexeme (or two base lexemes, in the case of compounding) to a constructed lexeme. This relation takes the form of the simultaneous application, to each of the three components that make up the lexemes concerned, of a bundle of functions that are independent of one another.


— Particular attention is paid to the formal expression of RCL and, in particular, to the semantic treatment. The latter uses a formalism combining logic and λ-calculus, which makes it possible to represent the semantic content of the lexeme patterns at the input and output of RCL, and of the lexemes that instantiate these patterns (see Figures 1 to 3 below).

³ In the rest of the chapter, we represent lexemes in small capitals, following the notation proposed by Matthews (1974).
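
The architecture described in this list (a three-field lexeme, and a rule made of one independent function per field) can be sketched as follows. This is only a minimal illustration under stated assumptions: the class names `Lexeme` and `Rule`, the use of orthographic strings for the form field, and the encoding of the semantic field as a plain string are all hypothetical simplifications, not the formalism of Fradin (2003).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lexeme:
    form: str  # formal field (orthographic here, for simplicity)
    cat: str   # syntactic-categorial field
    sem: str   # semantic field, kept as an unanalyzed string


@dataclass(frozen=True)
class Rule:
    """A rule as a bundle of three functions, one per field of the lexeme."""
    on_form: Callable[[str], str]
    on_cat: Callable[[str], str]
    on_sem: Callable[[str], str]

    def apply(self, base: Lexeme) -> Lexeme:
        # Each function sees only its own field: the three operations are independent.
        return Lexeme(
            form=self.on_form(base.form),
            cat=self.on_cat(base.cat),
            sem=self.on_sem(base.sem),
        )


# Hypothetical encoding of a dé- prefixation rule on verbs in -iser.
de_prefixation = Rule(
    on_form=lambda f: "dé" + f,                         # prefix the form
    on_cat=lambda c: c,                                 # category unchanged (V -> V)
    on_sem=lambda s: s.replace("P'(y)", "NOT(P'(y))"),  # negate the embedded property
)

SENSIBILISER = Lexeme("sensibiliser", "V", "λyλxλe(CAUSE(e, x, (become P'(y))))")
DESENSIBILISER = de_prefixation.apply(SENSIBILISER)
print(DESENSIBILISER)
```

Because the three functions are separate values, one of them can be changed (for instance the form function) without touching the other two, which is the modularity the list insists on.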


The main break that the model makes with the theoretical systems that preceded it, notably Corbin (1987), lies in the three-level description of the relation that an RCL establishes between the lexemes it connects. While freeing itself from the assembly of morphemes imposed by theories belonging to the Item and Arrangement framework, this principle also makes obsolete the associative rules, the Copy Principle and the Principle of Semantic Uniqueness of morphological processes of Corbin (1987), and thus removes the need to resort to "possible words" as indispensable steps in the constructional analysis of certain derivatives. This fundamental property of RCL opens up new perspectives for the analysis of parasynthetic constructions, as we detail in section 4.2 with the examples of the verbs DÉRATISER and DÉSENSIBILISER.


4.2 Analysis of verbs in dé-X-iser


The analysis of verbs matching the pattern dé-X-iser in Fradin (2003), also found in Fradin (1997a) for adjectival prefixed words in anti-, demonstrates the need to disconnect the formal and semantic operations of RCL: the interpretation of dé-X-iser involves the meaning of X (whether adjectival, as in DRAMATIQUE → DÉDRAMATISER or SENSIBLE → DÉSENSIBILISER, or nominal, as in RAT → DÉRATISER), whereas its form is motivated by that of X-iser. The solution of Fradin (2003: 297) is to make the verb X-iser the input of the dé- prefixation rule. This is therefore a prefixation relation between two verbs, the base being formally suffixed with -iser. One and the same RCL applies whatever the category of X (noun or adjective) and whatever the meaning of dé-X-iser: cancellation of the property X (SENSIBLE → DÉSENSIBILISER), or dissociation of the part from the whole to which this part is initially attached (RAT → DÉRATISER or NICOTINE → DÉNICOTINISER; in these two cases, the dissociated part is expressed by X, with X = RAT and X = NICOTINE respectively). The mechanisms of this RCL are detailed below for the analysis of DÉSENSIBILISER and DÉRATISER. The same RCL applies to dé-X-iser derivatives for which the noun X denotes the whole that will be deprived of one of its parts once the process described by dé-X-iser has taken place, such as DÉBUDGÉTISER ‘to remove from the budget’. In all cases, the main goal of the analysis is to legitimize the presence of the suffix -iser. The differences between form and meaning in the operation of the RCL are due to the semantic content of X-iser. They naturally affect the semantic content of the derivative dé-X-iser produced by the RCL.

When the RCL is applied to SENSIBILISER (Figure 1), the semantic pattern of X-iser describes an accomplishment predicate (represented by the factitive primitive "CAUSE") leading to a change of state (represented by the primitive "become"). The patient argument of the verb X-iser, which undergoes the change of state, is represented by the variable y, and the agent (i.e. the causer) by the variable x. The semantic content of X-iser involves a predicate represented by the variable P′ and applied to y: this is the property characterizing the referent of y, that is, the semantic content of X. This shows that the meaning of X-iser clearly displays that of its base X, which makes it available for the construction of the meaning of dé-X-iser. The formal notation adopted by Fradin (2003) in the semantic field of lexemes highlights the combination of the various links (operators, predicates and semantic primitives) that build the meaning of a lexeme, in particular when it is morphologically constructed.


       | X-iser                                | dé-X-iser
form   | F                                     | deF
cat    | V, StrArg = <SN,SN>                   | V, StrArg = <SN,SN>
sem    | λyλxλe(CAUSE(e, x, (become P′(y))))   | λyλxλe(CAUSE(e, x, (become NOT(P′(y)))))


Figure 1: RCL: X-iser_V → dé-X-iser_V, where X is an adjective and F is the form of X_A-iser


The output of the RCL describes a predicate that is likewise telic and expresses a change of state: the logical structure describing the meaning of dé-X-iser involves the same primitives "CAUSE" and "become" as its base. But note that, in order to represent the privation of the property P′, which corresponds to the meaning of the adjective X and holds of y before the process begins, the RCL extracts the predicate P′(y) from the semantic field of X-iser and applies the negation operator "NOT" to it. In other words, the RCL does not build the meaning of dé-X-iser from that of X-iser, but directly from that of X. In this way, it signals that the cancelled property is not necessarily the result of a prior process (for example, "désensibiliser une dent" consists in removing from the tooth its sensitivity to pain, which is an inherent physiological property of body parts). The use of a formal representation of meaning thus allows Fradin (2003) to connect the semantic structure of pref-X-suf directly to the predicate expressing the meaning of X (i.e. P′(y) when X denotes an adjectival property), which the RCL accesses through the combination of primitives that define X-iser. What this representation also implies is the existence of a first link that makes it possible to explain how the RCL works. In other words, the RCL connects not two lexemes but three, which could be represented by the pattern in Figure 2, with the meaning of X motivating both that of X-iser and that of dé-X-iser.


       | X         | X-iser                                | dé-X-iser
form   | F         | Fiz                                   | deFiz
cat    | A         | V, StrArg = <SN,SN>                   | V, StrArg = <SN,SN>
sem    | λyP′(y)   | λyλxλe(CAUSE(e, x, (become P′(y))))   | λyλxλe(CAUSE(e, x, (become NOT(P′(y)))))


Figure 2: Combination of two RCL: X_A → X-iser_V → dé-X-iser_V, where F is the form of X

For the analysis of DÉRATISER, the same rule schema is applied: it implicitly connects X and X-iser, and explicitly connects X-iser and dé-X-iser. As with DÉSENSIBILISER, the prefixed verb is treated as a simple case of suffixation followed by prefixation (Fradin 2003: 298), except that the intermediate link X-iser, here RATISER, is pragmatically implausible. Following the model of Figure 2, we reproduce in Figure 3 the schema of Fradin (2003: 297), modifying it to bring out the role of X. The figure shows that the semantic content of X-iser describes the location of the referent of X (λy rat′(y)) on or in what the patient z of the verb denotes; the mechanism that builds the semantic content of dé-X-iser proceeds as for DÉSENSIBILISER: the meaning of DÉRATISER, which describes the final state of the referent of the patient z, rid of what X denotes, is not built from the semantic content of X-iser, but directly exploits the predicate λy rat′(y), which it extracts from the semantic field of that verb. The analysis here calls on a slightly different line of reasoning, since the attestation of X-iser is optional. This step, which is semantically motivated, is also justified by the uniform treatment of pref-X-suf forms. We will see, moreover, in section 6, that the analysis proposed within ParaDis explicitly integrates X into the analysis of DÉRATISER and DÉSENSIBILISER, by including binary derivational relations in modules that give access to part of the derivational family of constructed lexemes.


       | X         | X-iser                                        | dé-X-iser
form   | F         | Fiz                                           | deFiz
cat    | N         | V, StrArg = <SN,SN>                           | V, StrArg = <SN,SN>
sem    | λyP′(y)   | λxλzλe∃y(CAUSE(e, x, (LOC(y, in(z)) ∧ P′(y)))) | λxλzλe∃y(CAUSE(e, x, (NOT(LOC(y, in(z)) ∧ P′(y)))))


Figure 3: Combination of two RCL: X_N → X-iser_V → dé-X-iser_V, where F is the form of X

Since an RCL can independently select the formal characteristics of a verb and the semantic content of the adjective that is its base, this solution is in fact a first step towards a network conception of derivational morphology: the construction of the prefixed word depends on the form of one member of its derivational family and on the meaning of another. Compared with the analyses described in section 3, that of Fradin (2003) rests on a uniform formal relation and turns the parasynthetic "puzzle" into a simple relation between a base verbal lexeme (possibly unattested) and a verbal lexeme constructed by prefixation. It also solves the problem of the mismatch between form and meaning thanks to formal representations that make it possible to build an appropriate meaning for the derivative from the relevant meaning elements present in the semantic representation of the base.

Nevertheless, the proposed analysis is not fully satisfactory: the treatment of DÉRATISER, or of a verbal neologism such as EMPUISSANTISER in "Or il y a deux moyens d'empuissantiser les idées." (‘Now, there are two ways of giving power to ideas’; a quotation from the economist Frédéric Lordon⁴ heard on France Culture in 2015), requires resorting to the device of reconstructing an unattested verb. Nor does it make it possible to know a priori which forms the suffixal sequence can take, because the approach is descriptive and geared towards analysis: given a form matching the pattern pref-X-suf, the RCL makes it possible to explain its meaning and form. On the other hand, the mechanism is not designed to account for variation in the number and diversity of the possible inputs of the RCL, nor to predict the fact that several synonymous pref-X-suf forms can be built from the same X. In other words, RCL do not make it possible, for example, to describe all the mechanisms behind the regularity which explains that ‘against cancer’ is a paraphrase of the meaning of ANTICANCER, ANTICANCÉREUX, ANTICANCÉRIGÈNE or ANTIONCOLOGIQUE, nor those which make ANTIVIBRATION, ANTIVIBRATOIRE, ANTIVIBREUR, ANTIVIBRATEUR, ANTIVIBRATIF, ANTIVIBRANT and ANTIVIBRATILE so many competing derivatives meaning ‘against vibrations’. The fundamental principle of RCL, namely the independent and simultaneous action of their constituent operations, is therefore necessary, but it is not sufficient to fully explain so-called parasynthetic constructions.


4.3 Summary


The theoretical principles defended in Fradin (2003) include proposals that are central to the ParaDis model, the subject of section 6. Some are stated explicitly: the lexeme supersedes the morpheme as the unit of processing in the construction of the lexicon; it is a three-dimensional, semantically specified unit with an organized set of free and suppletive stems, for which Fradin (2003: 138-140) proposes a first structuring according to their "free" or "learned" status; the RCL that link these lexemes involve functions acting independently on each of the three connected dimensions. We will see that in ParaDis this modular aspect of lexical construction is extended to the relations between the elements of the lexicon. But we will also show that the development of ParaDis benefits from advances in the model of Fradin (2003) that the author does not put forward. On the one hand, his analysis of parasynthetics involves a network morphology in all but name: the fact that the RCL that builds the derivative dé-X-iser can directly use the semantics of the adjective X, the base of the verb X-iser, presupposes that the derivational relation between X-iser and its own base X is accessible via the internal structure of the suffixed verb. The analysis of parasynthetics, through the example of verbs prefixed with dé-, shows that construction processes have access, beyond the base/derivative pair, to the other members of their derivational family. On the other hand, even though Fradin (2003) does not state it explicitly, the way in which the RCL organizes the relation between two members of a constructional family breaks open the supposedly necessary orientation of derivational processes from base lexeme(s) to constructed lexeme: insofar as the mechanism for applying an RCL imposes no constraint on the relative complexity of the base(s) and the constructed lexeme that it connects, each of these lexemes can be more complex than (or as complex as) the other, formally but also semantically. Ultimately, the principles of independence of the functions that make up RCL pave the way for analyses involving non-directional and bidirectional constructional relations.

⁴ Lordon, Frédéric (2016). Les affects de la politique, Seuil, Paris.
⁵ The Google query "ratiser" returns no useful page (08/10/2016).

Many French-speaking morphologists have adopted the ideas defended in Fradin (2003) and
developed them further. The years following the publication of this book thus saw the
development of numerous analyses based on the RCL model, some of which extend its
theoretical principles: in particular, various studies have addressed the formal
structure of the lexeme (Bonami & Boyé 2007), the incorporation of suppletive or learned
stems (Amiot & Dal 2005, Bonami et al. 2009), or their extension to additional
derivational themes (Tribout 2012). At the same time, the RCL and the notion of lexeme
have prompted reactions and criticisms, leading to work that builds on the principles
that emerged from these debates. This is what section 5 presents.


5 Towards a network-based derivational morphology


As we have just seen, the RCL model constitutes decisive progress, with significant
benefits for the analysis of parasynthesis. As mentioned above (section 4.1), the
theoretical framework developed by Fradin (2003) was a major reference for most of the
research in derivational morphology carried out in France and elsewhere in the 2000s.
This is notably the case of the work carried out in Toulouse within the DUMAL research
strand ("Des Unités Morphologiques Au Lexique") and, more generally, by the
morphologists of the ERSS. The DUMAL book (Roché et al. 2011) offers a synthesis of it.
In this section we present the principles that guided this work and the advances it made
possible, in particular in the analysis of parasynthetic derivatives.
5.1 Variability of morphological derivatives


Fradin's (2003) theoretical framework is characterized by its formal nature, which
places it in the tradition of the research conducted at the LLF laboratory. The analyses
developed in this framework bear mainly on the semantic aspects of morphological
derivation. Fradin (2003) proposes a formal system that is both original, in its use of
the λ-calculus for describing lexical meaning, and relatively classical, in its
mechanism of multiple inheritance of underspecified lexemes guided by a hierarchical
structure of the lexicon, in this case a lattice. The formalization of the construction
of derivational meaning is in some respects superior to descriptions by means of
paraphrases or glosses. The informal nature of the latter can indeed be considered to
make the demonstrations that use them undecidable (or irrefutable), because it precludes
any proof of the properties and generalizations put forward. However, linguistics is not
a purely formal system: it handles natural material, so that instantiating variables and
predicates when interpreting formal representations is a step into the informal that
taints and weakens the demonstrations. Moreover, the formal and explicit nature of these
descriptions has some limitations, which probably explain why these aspects of Fradin's
(2003) model have not met with the same level of acceptance as the description of
derivations by means of RCL. The description of meaning in the formalism of the
λ-calculus has several weaknesses:


* it is unfamiliar to most morphologists;

* the representations are difficult to build, combine and exploit, whether manually or
by program: few morphologists are able to "make them work";

* it is too rigid to account for the variety of semantic properties involved in
morphological construction and for the plasticity of the meaning of derivatives.


Fradin (2003) is thus led to multiply semantic instructions. These instructions are
disjoint, and the formal framework provides no simple mechanism for expressing their
similarity. In the case of suffixation in -ette, for instance, feminine derivatives
(e.g. FLIC → FLIQUETTE, whose meaning is built by the semantic instruction (13)) and
deverbal place nouns (e.g. COUCHE → COUCHETTE, whose meaning is built by (14)) share no
property (Fradin et al. 2003).


(13) $\lambda y\,\lambda P'.\ P'(y) \wedge \mathit{femelle}'(y)$
(14) $\lambda y\,\lambda x\,\lambda P'\,\exists e.\ P'(e, x, \ldots) \wedge \mathit{dans}'(e, y) \wedge \mathit{temporaire}'(e)$

The Toulouse morphologists proposed various adjustments to Fradin's (2003) framework in
response to these limitations, notably to obtain a flexibility suited to the semantic,
formal and categorial variability of morphological construction. They took up the
fundamental principle of dissociation between formal, categorial and semantic
representations proposed in Fradin (2003: 9), which is indispensable for accounting for
mismatches between form and meaning, since it allows several lexemes to cooperate in
forming a derivative, using the form of one and the meaning of another. The notion of
"base" is thereby redefined and corresponds to the lexeme that semantically motivates
the derivative. This is the case, for example, in the construction of the adjective
PIANISTIQUE, whose form is built relative to PIANISTE but whose interpretation, in a
phrase such as « un concerto/une sonate/une sonorité pianistique » ('a pianistic
concerto/sonata/sonority'), refers directly to the semantic content of the noun PIANO.
This decoupling has been implemented in two different ways. Roché (2009) proposed to
regard the construction of a derived lexeme as a set of independent (elementary)
phonological, syntactic and semantic operations. There is no a priori constraint on this
set other than that it must not be empty. He proposes, for example, to treat the
derivation RAT → DÉRATISER⁶ as composed of four elementary operations:


1. a categorial operation N → V;

2. a formal operation of suffixation in -iser, which signals the categorial operation;

3. a semantic operation 'rat' → 'eliminate all instances of rat that are in a place';

4. a formal operation of prefixation in dé-, which signals the semantic operation.


Parasynthetic derivatives such as PARLEMENT → ANTIPARLEMENTAIRE can be analysed in
exactly the same way: a categorial operation (N → A), a semantic operation ('P' → 'that
is against P') and two formal operations (a prefixation in anti- and a suffixation in
-aire), the prefixation signalling the semantic operation and the suffixation the
categorial operation. This proposal has not been developed further, and Roché (2009)
says nothing about the constraints that bear on these outputs or about the associations
between these constraints.
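
Roché's decomposition can be made concrete with a small data structure. The sketch below
is only an illustration and is not taken from Roché (2009): the class `Operation`, its
field names and the two helper functions are hypothetical, and the only condition encoded
is the one stated above, namely that the set of operations must not be empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Operation:
    """One elementary operation (hypothetical encoding, for illustration only)."""
    level: str                     # "categorial", "semantic" or "formal"
    description: str
    signals: Optional[str] = None  # for a formal operation: the level it marks


def is_valid_derivation(operations):
    """The only a priori condition: the set of operations is not empty."""
    return len(operations) > 0


def exponents(operations):
    """Map each signalled level to the formal operations that mark it."""
    marks = {}
    for op in operations:
        if op.level == "formal" and op.signals is not None:
            marks.setdefault(op.signals, []).append(op.description)
    return marks


# RAT -> DÉRATISER, following the four operations listed in the text.
DERATISER = [
    Operation("categorial", "N -> V"),
    Operation("formal", "suffixation in -iser", signals="categorial"),
    Operation("semantic", "'rat' -> 'eliminate all instances of rat in a place'"),
    Operation("formal", "prefixation in dé-", signals="semantic"),
]

# PARLEMENT -> ANTIPARLEMENTAIRE, analysed in the same way.
ANTIPARLEMENTAIRE = [
    Operation("categorial", "N -> A"),
    Operation("semantic", "'P' -> 'that is against P'"),
    Operation("formal", "prefixation in anti-", signals="semantic"),
    Operation("formal", "suffixation in -aire", signals="categorial"),
]

print(exponents(DERATISER))
print(exponents(ANTIPARLEMENTAIRE))
```

Because `exponents` collects a list per level, nothing in this encoding prevents several
formal exponents from signalling the same operation.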

Hathout (2011) proposes another, more elaborate implementation, based on a model with
four levels of representation and on several sets of constraints. Some bear on the
representations of one of the four levels, while others are intended to control the
correspondence between the representations of the phonological, syntactic and semantic
levels and those of the lexical level. In this model, a large part of the constraints
are expressed in terms of analogy, and morphological construction is seen as the
computation of an optimal solution with respect to the set of constraints that bear on
the lexemes taking part in the constructional operation. The decoupling of the four
levels gives the system the degrees of freedom needed to account for the association of
a single constructed form with several meanings, as with antipaternel, which can mean
'against fathers' or 'relating to anti-fathers', and for the multiplicity of forms that
can express a single meaning, as with antivibration, antivibratoire, antivibrant,
antivibreur, etc., all of which can be associated with the same constructed meaning
'against vibrations'. This proposal is largely taken up in ParaDis.


⁶ Recall that in Fradin's (2003) analysis of DÉRATISER the involvement of the noun RAT is
only implicit, even though the representation we give of it in figure 3 shows it
explicitly.

As can be seen, making the framework defined by Fradin (2003) more flexible involves
replacing formal representations with sets of constraints. Inspired by Optimality Theory
(Prince & Smolensky 1993; McCarthy & Prince 1993), these constraints are conflicting and
violable. Initially defined on morphophonological characteristics, such as dissimilative
constraints (Plénat 2011), and on prosodic ones (Plénat 2009b), they were later extended
to more structural properties bearing on derivational families and series (Hathout 2011;
Roché & Plénat 2014). The ParaDis model, which we detail in section 6, takes up both
this principle of controlling morphological constructions by constraints and the formal
representation of meaning. This semantic formalization, similar to the one used in the
morphological database Démonette (Hathout & Namer 2014a; Hathout & Namer 2016), differs
from that of Fradin (2003). It comprises, on the one hand, a semantic typing of the
variables that represent the meaning of the lexemes involved in a given construction
and, on the other hand, a formal representation of the meaning relations that hold
between these lexemes.

On the methodological level, the Toulouse morphologists have defended and used an
extensive approach (Plénat et al. 2002; Hathout et al. 2003, 2008, 2009) which, fully in
keeping with their interest in variation and variability, consists in collecting as many
attestations and examples of the phenomena under study as possible, notably on the Web,
and in proposing analyses that account for all the data collected. The extensive
approach was notably used by Hathout et al. (2009) and Hathout (2011) for the analysis
of prefixation in anti- (see section 5.3). It brought to light unexpected derivatives
such as ANTIDÉSHERBANT, derived from HERBE and synonymous with DÉSHERBANT. This lexeme is
formed by three formal operations: a suffixation in -ant, which signals the categorial
operation N → A, and two prefixations (one in dé- and one in anti-), which signal the
same semantic operation.


5.2 Embedding in the lexicon


The DUMAL approach to derivational morphology (Roché et al. 2011) has also strongly
emphasized the embedding of derivational morphology in the lexicon. This essential
relation is notably the subject of an important article by Michel Roché (2009). Beyond
the now consensual facts that (i) one of the functions of derivational morphology is to
build words capable of entering the lexicon (though this is not always the case, as Dal
& Namer (2016) have shown) and (ii) morphology uses the lexicon as a resource in which
it finds the bases, and more broadly the lexemes, that it needs, morphological
construction is subject to the pressure of the existing lexicon. This state of affairs
makes it possible to explain exceptional mismatches due to the presence in the lexicon
of similar-sounding words, as in the suffixation in -esque BAMBOU → BAMBOULESQUE, where
the very rare epenthesis of /l/ is licensed by the presence in the lexicon of the lexeme
BAMBOULA and of its relational adjective BAMBOULESQUE (Plénat 2009a).
Taking the existing lexicon into account is not in itself an innovation. It notably
underlies the Copy Principle (Principe de Copie) introduced by Dell (1970) and taken up
by Corbin (1980, 1987) which, as recalled above (section 3.4), is used to explain the
selection of the suffix in parasynthetic derivatives. This principle was extended by
Roché (2007) into an Economy Principle (Principe d'Économie), which states that the
"language [tends to reuse] a form that already exists in the derivational paradigm, in
violation of the instruction specific to the affix, [rather than building a new form]".

These two principles are meant to preserve and reinforce the regularities that exist in
the lexicon (or the simplicity of the lexicon, in the terms of Dell (1970)),
regularities that determine its morphological organization. The Toulouse approach
differs markedly from that of Fradin (2003) in this respect. As indicated in section
4.1, the latter adheres to a hierarchical conception of the lexicon in which the various
categories are linked by relations of multiple inheritance (see also Koenig (1999),
Davis & Koenig (2000)). Conversely, the structures envisaged within the DUMAL strand are
more paradigmatic in nature and belong to an "output-oriented" framework, whereas Fradin
(2003) is heir to the "input-oriented" generative traditions, even though his model
served as a springboard for moving away from them. Thus, in the approach developed at
the ERSS, the pressure of the existing lexicon is exerted in directions defined by two
types of structures: derivational families and derivational series. Although the notion
of derivational family, traditionally called "morphological family", is well known, it
plays no role in earlier theoretical models of derivational morphology. Its
formalization begins with Hathout (2011), who makes it the foundation of the model he
proposes. A derivational family groups together a set of words connected by relations of
morphological construction (e.g. the derivational family of LAVER contains the words
that are directly or indirectly related to it: LAVEUR, LAVEUSE, LAVOIR, LAVAGE, LAVERIE,
LAVETTE, DÉLAVER, etc.); a series brings together a set of words of the lexicon formed
by the same derivational process (e.g. by suffixation in -able). These structures are
essential for the analysis of parasynthetic derivatives because they give access to the
various lexemes involved in their construction. These lexemes guide the constructional
operation and supply it with the elements of form and meaning that it needs. Families
and series underlie the paradigmatic organization of the derivational lexicon. At a
relational level, the embedding of derivational morphology in the lexicon makes it
possible to account for the fact that constructional relations form analogies that
connect derivational series. These connections aggregate into graphs that define
derivational paradigms, as we detail in section 6.
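
The two structures just defined can be computed from a list of constructional relations.
The following sketch is a toy illustration under stated assumptions: the triples in
`RELATIONS`, and in particular the process labels and the choice of base for each
derivative, are ours and are not taken from the works cited; families are obtained as
connected components of the undirected relation graph, series by grouping derivatives by
process.

```python
from collections import defaultdict

# (base, derivative, process). The lexemes come from the LAVER family and the
# VIDER/VIDAGE pair; the process labels and pairings are illustrative assumptions.
RELATIONS = [
    ("LAVER", "LAVEUR", "-eur"),
    ("LAVER", "LAVAGE", "-age"),
    ("LAVER", "LAVOIR", "-oir"),
    ("LAVER", "DÉLAVER", "dé-"),
    ("LAVEUR", "LAVEUSE", "-euse"),
    ("VIDER", "VIDAGE", "-age"),
]


def families(relations):
    """Derivational families: connected components of the relation graph."""
    neighbours = defaultdict(set)
    for base, derivative, _ in relations:
        neighbours[base].add(derivative)
        neighbours[derivative].add(base)
    seen, result = set(), []
    for start in list(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            lexeme = stack.pop()
            if lexeme in component:
                continue
            component.add(lexeme)
            stack.extend(neighbours[lexeme] - component)
        seen |= component
        result.append(component)
    return result


def series(relations):
    """Derivational series: derivatives grouped by the process that forms them."""
    grouped = defaultdict(set)
    for _, derivative, process in relations:
        grouped[process].add(derivative)
    return dict(grouped)


print(families(RELATIONS))
print(series(RELATIONS)["-age"])
```

In this toy lexicon LAVEUSE belongs to the family of LAVER only indirectly, through
LAVEUR, while the series in -age cuts across the two families.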


5.3 Improvements in the analysis of parasynthetic derivatives


Hathout (2011) outlines a model of derivational morphology that he uses to describe
derivation in anti-, and in particular the multiple correspondences between forms and
meanings. He proposes notably that, when a derivative is built, the stem is chosen from
an extended set that contains the themes of the base but also those of all the other
members of its derivational family. The properties of the derivative and of its
constructional relations make it possible to select an optimal theme that suits the
strongest coalition of constraints able to form. These constraints bear on the
morphophonological characteristics of the form of the lexemes (phonation, dissimilation,
size), on their integration into the existing lexicon (maximizing resemblance to
existing forms, inclusion in the derivational family and series), on semantic and
categorial transparency, etc. Moreover, this model predicts that this selection depends
on the importance the speaker gives to each constraint, and that this weighting varies
with the context in which the derivative is used. This is how at least nine derivatives
in anti- formed on VIBRATION can be observed (ANTIVIBRATION; ANTIVIBRANT; ANTIVIBRATOIRE;
ANTIVIBRATIF; ANTIVIBRATILE; ANTIVIBRATEUR; ANTIVIBREUR; ANTIVIBRABLE; ANTIVIBRE). It also
predicts that, although speakers may occasionally favour one or another of these
constraints, the paradigmatic structure of the existing lexicon, shared by the whole
community, exerts a strong pressure that makes it possible to predict which of the
competing lexemes will be chosen most often. For example, in the competition between the
lexemes ANTICANCER, ANTICANCÉREUX and ANTICANCÉRIGÈNE, all of which mean 'against
cancer', ANTICANCER is preferred to the other two because it satisfies almost all the
constraints identified by Hathout (2011):


* its form has a satisfactory CV structure;

* it raises no dissimilation problems;

* the size of its stem is optimal (two syllables);

* its form allows optimal identification of the (semantic) base, the family and the
derivational series of the derivative;

* the lexeme is semantically transparent; etc.


Only the constraint of categorial transparency is violated, because ANTICANCER, which is
an adjective, has the form of a noun, conferred by its ending cancer, the French lexicon
containing very few adjectival forms ending in /sɛʁ/. Conversely, the two other
competitors satisfy this constraint, since their stems are adjectival forms and their
endings (/ø/ and /ʒɛn/) are frequent among constructed adjectives. Of the two,
ANTICANCÉREUX is clearly preferred by speakers to ANTICANCÉRIGÈNE, because it better
satisfies one of the strong constraints of the system, namely semantic transparency: the
interpretive similarity of CANCER is greater with CANCÉREUX than with CANCÉRIGÈNE.
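
The competition just described can be sketched as a weighted evaluation of candidates.
The code below is a toy illustration, not Hathout's (2011) model: the assumption that
weighted violations simply add up, the weights in `WEIGHTS` and most of the violation
counts in `VIOLATIONS` are hypothetical. Only three facts come from the text: ANTICANCER
violates categorial transparency, its two competitors do not, and ANTICANCÉREUX is more
semantically transparent than ANTICANCÉRIGÈNE.

```python
# Hypothetical weights: the source gives no numerical values.
WEIGHTS = {
    "semantic_transparency": 3.0,
    "stem_size": 1.0,
    "categorial_transparency": 1.0,
}

# Violation counts per candidate (illustrative values, see the lead-in).
VIOLATIONS = {
    "ANTICANCER": {"categorial_transparency": 1},
    "ANTICANCÉREUX": {"stem_size": 1, "semantic_transparency": 1},
    "ANTICANCÉRIGÈNE": {"stem_size": 1, "semantic_transparency": 2},
}


def cost(violations, weights):
    """Weighted sum of the violations of one candidate."""
    return sum(weights.get(name, 0.0) * count for name, count in violations.items())


def rank(candidates, weights):
    """Candidates ordered from the least to the most costly."""
    return sorted(candidates, key=lambda lexeme: cost(candidates[lexeme], weights))


# Default weighting: the order of preference reported in the text.
print(rank(VIOLATIONS, WEIGHTS))

# A speaker or context that gives much more weight to categorial transparency.
print(rank(VIOLATIONS, {**WEIGHTS, "categorial_transparency": 10.0}))
```

Changing a single weight is enough to change the winner, which is one way to picture the
claim that the weighting varies with the speaker and the context.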


5.4 Summary


The approach to derivational morphology developed at the ERSS thus differs markedly from
the one proposed by Fradin (2003): it is output-oriented, sets up a paradigmatic
architecture in which derivational family and series complement the notion of lexeme, is
embedded in the lexicon, takes into account the pressure of the existing lexicon, and
defines an extended set of constraints that control morphological construction while
giving the model enough flexibility to account for the plasticity of meaning and for
formal variation. These properties make it a framework particularly well suited to the
analysis of parasynthetic derivatives. A final difference from Fradin (2003) concerns
the attitude of the DUMAL strand towards the formalization of meaning. In this respect,
Fradin (2003) corresponds more closely to the canon of research in linguistics. On the
other hand, Fradin (2003) and Roché et al. (2011) agree on the tripartite organization
of the lexeme, of the RCL and of derivational relations. The main proposals of these two
conceptions of derivational morphology are integrated into the ParaDis model, which
articulates them in a radically paradigmatic organization.


6 Modular derivation in the ParaDis model


As we have just done for the theories that preceded it in constructional morphology, we
present the ParaDis model (Paradigms and Discrepancies) by showing how it makes it
possible to analyse, explain and predict the construction of lexemes, and notably of
those which, like parasynthetic derivatives, depart from the canonical principles of
formal transparency and semantic compositionality. ParaDis is a synthesis of a set of
proposals that include the triangles proposed in Lignon et al. (2014), the cumulative
patterns of Bochner (1993), and the two currents of morphology developed in France that
have just been presented: the approach defended in Fradin (2003), based on the notion of
lexeme and on the independence of the operations that affect each of the three
constitutive dimensions of the RCL, and the approach developed in Toulouse within the
ERSS (Roché et al. 2011), which is based on the observation of authentic data and
advocates a network conception of morphology whose mechanisms rest on competition
between outputs arbitrated by an extended set of constraints rather than on the
application of rules. ParaDis integrates these different proposals into a paradigmatic
framework for derivational morphology and follows in the line of Roché (2009, 2010,
2011b), Plénat & Roché (2012) or Hathout (2008), whose analyses incorporate the notions
of derivational series and family. Section 6.1 offers a brief reminder of these notions
and, more generally, of the notion of paradigm. We then present ParaDis in section 6.2
and illustrate how it works on examples of parasynthetic derivation.


6.1 Derivational paradigms


The notion of paradigm is strongly associated with inflectional morphology, where it has
been clearly defined by authors such as Wunderlich & Fabri (1995: 266):


A paradigm is an n-dimensional space whose dimensions are the attributes (or fea- 
tures) used for the classification of word forms. In order to be a dimension, an attribute 
must have at least two values. The cells of this space can be occupied by word forms 
of appropriate categories. 
or Carstairs-McCarthy (1994: 739), who proposes to distinguish the features that define
the cells of paradigms (which he calls "abstract paradigms") from the forms that they
contain (which he calls "concrete paradigms"):


Let us call the abstract notion 'paradigm₁' and the more concrete one 'paradigm₂',
and define them as follows:


(1) PARADIGM₁: the set of combinations of morphosyntactic properties or features
(or the set of 'cells') realized by inflected forms of words (or lexemes) in a given
word-class (or major category or lexeme-class) in a given language.


(2) PARADIGM₂: the set of inflectional realizations expressing a paradigm₁ for a
given word (or lexeme) in a given language.
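
These two definitions lend themselves to a direct illustration. The sketch below is ours
and not part of either quotation: it builds the cells of an abstract paradigm as the
product of feature values, checking the requirement that each dimension has at least two
values, and fills them with the orthographic forms of the French adjective VIEUX 'old'.
The names `FEATURES`, `cells` and `VIEUX` are hypothetical.

```python
from itertools import product

# Dimensions of the paradigm and their values (French adjectives).
FEATURES = {"gender": ["M", "F"], "number": ["SG", "PL"]}


def cells(features):
    """Abstract paradigm: every combination of feature values."""
    for name, values in features.items():
        if len(values) < 2:
            raise ValueError(f"{name} has fewer than two values: not a dimension")
    return list(product(*features.values()))


# Concrete paradigm of VIEUX: one orthographic form per cell.
VIEUX = {
    ("M", "SG"): "vieux",
    ("F", "SG"): "vieille",
    ("M", "PL"): "vieux",
    ("F", "PL"): "vieilles",
}

print(cells(FEATURES))
print(sorted(set(VIEUX.values())))
```

The abstract paradigm has four cells, but the concrete paradigm of VIEUX fills them with
only three distinct orthographic forms, since the masculine singular and plural coincide.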


The paradigmatic approach has become quite standard, even dominant, in inflectional
morphology (Stump 2001, 2006a,b, Ackerman et al. 2009, Baerman et al. 2010, Bonami &
Stump 2016, Stump & Finkel 2013). This development was made possible by the acceptance
of word-based morphological models such as those of Blevins (2003-12, 2006). In these
models, inflected forms are seen as realizations of a lexeme and no longer as forms
generated by sets of rules operating on a base form. This makes it possible to refocus
research on the relations that hold between lexemes and their inflected forms, and to
group into paradigms the lexemes that share the same relations with their forms.

The situation is markedly different in derivational morphology, where there is no
consensus on the concept of paradigm. Some, like Stump (1991), propose to transpose to
derivation the definitions established for inflection, but this transposition is not
straightforward, and the question of working out a definition better suited to
derivation remains open. There are indeed notable differences between inflection and
derivation. In particular, as Stump (2001) points out, the correspondence between form
and meaning plays no part in inflection whereas it is central in derivation; in
addition, paradigmatic regularity and coherence are intrinsically greater in inflection
than in derivation (Pounder 2000, Štekauer 2014).

That said, the notion of paradigm has attracted growing interest in derivational
morphology in recent years (Štekauer 2014, Boyé & Schalchli 2016). Morphologists working
in this approach are interested notably in the paradigmatic dimension of derivation, in
the definition of paradigmatic morphological models, and in bringing the organization of
inflectional morphology and that of derivational morphology closer together (Van Marle
1985, Stump 1991, Bochner 1993, Booij 1996, Pounder 2000, Hathout et al. 2009, Roché
2009, Hathout 2011, Roché 2011b, Roché & Plénat 2014, Strnadová 2014a,b). Some of these
authors, such as Van Marle (1985), Stump (1991) and Pounder (2000), conceive of
derivational paradigms as simple extensions of inflectional paradigms. Derivational
paradigms nevertheless differ from inflectional paradigms, for example because they can
account for the semantic regularities of derivatives built by competing affixations,
such as French agent nouns in -eur (VOLEUR), -ant (REPRÉSENTANT) or -iste (JOURNALISTE),
whose semantic properties are similar and which entertain analogous relations with the
members of their respective derivational families.

Furthermore, derivational paradigms have been, in the wake of word-based models, a
response to the generative conception of morphological construction and to its binary,
oriented rules. Paradigmatic models involve derivational relations that may be oriented
in either direction (base → derivative or derivative → base) or not be oriented at all
(Jackendoff 1975). In addition, these relations are not limited to base-derivative
pairs. Derivational paradigms are thus particularly suited to describing the transversal
relations (cross-formations) that characterize, for example, pairs of derivatives in
-isme and -iste, or multiple affixations, for example in -isation or -ologique (Lasserre
& Montermini 2014), etc.

Derivational paradigms are networks of interconnected words that reproduce the patterns
(i.e. the regularities) formed by the many relations, of every kind, that each member of
the paradigm entertains with the others. These networks aggregate within derivational
families and overlap to form derivational series connected within analogies (Skousen
1989, 1992, Krott et al. 2001, Dal 2003, Blevins & Blevins 2009, Arndt-Lappe 2015). For
some authors, such as Stump (1991) or Spencer (2013), derivational paradigms describe
formal relations between two semantic classes, whereas Štekauer (2014) proposes that
they are organized around cognitive categories. Most authors consider that paradigms
consist of relations involving more than two elements (Van Marle 1985, Booij 2010), even
though for some, such as Spencer (2013), they contain only binary relations.


6.2 The principles of ParaDis


The ParaDis model is not a direct formalization of derivational paradigms but rather a
system that brings into play a set of devices for approaching constructional processes
from the angle of their paradigmatic dimension. The architecture of ParaDis thus
articulates two principles: the separation of the levels of description of lexemes, and
the modular conception of morphological construction. The first follows directly from
the analysis of Fradin (2003): the lexeme is a three-dimensional entity, and the three
dimensions operate simultaneously and independently in each construction rule. The
second corresponds to a change of scale: the unit of processing is extended to a
(sub)set of the members of the derivational family of the base-derivative pair, which
gives the system the capacity to analyse constructions in which form and meaning are not
coordinated, notably parasynthetic formations (section 4), but also affixal competition,
as in the formation of plantation nouns whose base denotes a plant (e.g. CERISE →
CERISAIE vs CERISE → CERISERAIE) or the rival formations of English denominal adjectives
in -ic and -ical (e.g. HISTORY → HISTORIC vs HISTORY → HISTORICAL, studied by Lindsay &
Aronoff (2013)), or polysemous derivational schemas, such as the one to which adjectives
in -istique belong, like FOOTBALLISTIQUE, which means 'relating to football' or
'relating to footballers' (see Strnadová (2014b) for an analysis of denominal adjectives
whose interpretation is ambiguous).


6.3 Four components


The essential difference between ParaDis and lexeme-based morphological models lies in
the descriptive unit of the constructional mechanism. In the lexematic current of
morphology, this unit is the pair formed by a derivative and its base, whereas in
ParaDis it is the module, a device that operates at the level of the network of lexemes.
The notion of module is inspired by the cumulative patterns introduced in Bochner
(1993), who proposes to merge schemas of regularly interconnected lexemes into n-ary
patterns. These patterns result from the overlap (in other words, the cumulation) of
elementary relations between shared lexeme schemas. These relations are comparable to
non-oriented RCL, in that they collectively inter-define the properties of the lexemes
they relate. Thus the cumulative pattern (15), which expresses the ternary relation that
regularly connects ideology nouns in -isme, nouns of adherents in -iste and the valued
object, is the product of the superposition of the binary structures (16), (17) and
(18), each of which expresses a fragment of a module (these examples are borrowed from
Strnadová (2014a)). In other words, as for Bochner (1993), a module is a connected graph
structure whose vertices describe sets of lexemes whose elements entertain relations of
mutual predictability. One corollary of this definition is that every sub-module is a
module.


(15)  /X/       /Xisme/                   /Xiste/
      N         N                         A
      'Z'       'movement promoting Z'    'pertaining to Z, to the movement promoting Z'

(16)  /X/       /Xisme/
      N         N
      'Z'       'movement promoting Z'

(17)  /X/       /Xiste/
      N         A
      'Z'       'pertaining to Z'

(18)  /Xisme/   /Xiste/
      N         A
      'Y'       'pertaining to Y'
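
The superposition of (16), (17) and (18) into (15) can be pictured as a union of
undirected edges. The sketch below is a simplification under stated assumptions: each
schema is reduced to its formal label, the names `BINARY`, `superpose` and
`is_connected` are hypothetical, and "sub-module" is read as a connected sub-graph. It
checks that the cumulated pattern and each of its binary fragments are connected graphs.

```python
# Schemas reduced to their formal label; categories and glosses are left out.
X, ISME, ISTE = "/X/", "/Xisme/", "/Xiste/"

# The three binary structures, numbered as in the examples.
BINARY = {
    16: frozenset({X, ISME}),
    17: frozenset({X, ISTE}),
    18: frozenset({ISME, ISTE}),
}


def superpose(edges):
    """Cumulate binary relations: union of their vertices and of their edges."""
    edges = set(edges)
    nodes = set().union(*edges) if edges else set()
    return nodes, edges


def is_connected(nodes, edges):
    """True if every vertex can be reached from any other along the edges."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for edge in edges:
            if current in edge:
                for other in edge - {current}:
                    if other in nodes and other not in seen:
                        seen.add(other)
                        frontier.append(other)
    return seen == nodes


nodes, edges = superpose(BINARY.values())
print(sorted(nodes), is_connected(nodes, edges))
```

Dropping two of the three edges leaves one schema isolated, so the remaining graph over
the three schemas no longer qualifies as a module in this reading.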


One of the differences between the formalism of Bochner (1993) and ParaDis is that in
the latter modular functioning is distributed over four levels of lexical description,
so that a module is defined as the product of four interconnected components, each
having the structure of a connected graph:


CS : Le composant sémantique est un réseau de classes sémantico-conceptuelles qui 
décrit la maniére dont celles-ci interagissent. 
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CF : Le composant formel est un graphe de corrélations entre des schémas phonolo- 
giques ou graphémiques. 


CC : Le composant catégoriel connecte les parties du discours impliquées. 


CL : Le composant lexical réunit les lexemes d'une méme famille dérivationnelle qui 
vérifient l'ensemble des relations exprimées dans les trois autres composants. 


In other words, a module expresses the morphological relations that hold between
certain lexemes of the same derivational family, examined independently and
simultaneously at each of the four levels of lexical representation. The network that
realizes the lexical component can be regarded as concrete. Each of the lexemes it contains
instantiates an abstract description at each of the three other levels. Put differently, the
lexical level is that of derivational families, that is, of concrete realizations, whereas the
three other levels describe derivational series in abstract form.

A module structures the relations between the members of a derivational sub-family on
four descriptive planes. The notion of lexical component, and more generally that of
module, makes it possible to refine the definition of derivational families. Whereas a family
is traditionally defined as the set of lexemes sharing a common ancestor, we consider in
ParaDis that two lexemes belong to the same family if they are linked by a path through one
or more connected lexical components. A derivational family thus becomes a connected
collection of lexical components. Take the example of the activity noun VIDAGE. It stands in
a regular relation with the verbal predicate VIDER, of which it is the process nominalization,
and with the noun VIDEUR, which is interpreted as the agent of this activity and whose base
is the same verb VIDER. The three lexemes stand in the same paradigmatic relation as, for
example, (19) or (20).


(19) BRACONNER, BRACONNEUR, BRACONNAGE


(20) COLLECTER, COLLECTEUR, COLLECTAGE
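
The path-based definition of a derivational family given above can be sketched as a graph connectivity test. The following Python fragment is only an illustration and not part of ParaDis itself: the function names are hypothetical, the component (VIDE, VIDER) is added here purely as an assumed example of a second lexical component sharing a lexeme with (VIDER, VIDEUR, VIDAGE), and each lexical component is simplified to a set of mutually linked lexemes, which leaves connectivity unchanged.

```python
from collections import defaultdict, deque

# Lexical components, each given as a tuple of lexemes.
# (VIDE, VIDER) is an assumption made for this illustration only.
COMPONENTS = [
    ("VIDER", "VIDEUR", "VIDAGE"),
    ("VIDE", "VIDER"),
    ("BRACONNER", "BRACONNEUR", "BRACONNAGE"),
]


def build_graph(components):
    """Link every pair of lexemes that co-occur in a lexical component."""
    graph = defaultdict(set)
    for component in components:
        for a in component:
            graph[a]  # make sure the vertex exists even if isolated
            for b in component:
                if a != b:
                    graph[a].add(b)
    return graph


def same_family(graph, a, b):
    """True if a path through lexical components links lexemes a and b."""
    if a not in graph or b not in graph:
        return False
    seen = {a}
    queue = deque([a])
    while queue:
        current = queue.popleft()
        if current == b:
            return True
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


graph = build_graph(COMPONENTS)
print(same_family(graph, "VIDE", "VIDAGE"))        # linked via VIDER
print(same_family(graph, "VIDAGE", "BRACONNAGE"))  # no shared component
```

Under these assumptions, VIDE and VIDAGE fall in the same family although they never appear in the same component, because the two components overlap in VIDER.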


In ParaDis terminology, (VIDER, VIDEUR, VIDAGE) constitutes the lexical component of the
module represented in Figure 4. The paradigm it describes also includes the triplets (19)
and (20). This module is regular: it involves semantic (conceptual) categories that are
logically connected, in that a predicate (PRED) is nominalized as an activity (ACT) and
requires an AGENT, and derivational schemas that are formally interpredictable: the stem Y
of the verb, used in inflection to build the imperfect forms, is also used to build the nouns
in -age and in -eur. Each vertex in one component is connected to at least one vertex in each
of the three others. Figure 4 captures the paradigmatic regularity that characterizes the
triplet (VIDER, VIDEUR, VIDAGE), which shows in the isomorphic (here triangular) geometry of
the structures realizing the formal (CF), semantic (CS) and lexical (CL) components.

To keep the diagrams in Figures 4 to 8 readable, the categorial component is not
represented explicitly. The grammatical categories of the connected lexemes are indicated
as subscripts in the lexical component. Solid lines represent connections between elements
within a component, and dotted lines link the components to one another. The regularity of
the construction of (VIDER, VIDEUR, VIDAGE) shows up as a doubly motivated connection in
the CL between the elements of the triplet. Each of the three concrete relations in the CL is
an instance of the corresponding abstract relation in the two other components.


Figure 4: Module corresponding to the analysis of (VIDER, VIDEUR, VIDAGE). The categorial
level is omitted.


6.4 The "ParaDisiac" analysis of adjectives in anti-X-suf


We have shown that parasynthetic derivation is a model of prefixation that is widespread
across languages, frequently observable with a wide variety of suffixes and, as Hathout
(2011) has shown, extremely productive. For a derivative pref-X-suf, the suffixal mark suf
coincides with the exponent of one of the suffixed derivatives of X, i.e. X-suf when it is
attested. This shows that, although pref-X-suf is defined with respect to X, its form borrows
the segment suf from the lexeme X-suf, which is derivationally related to X. The model of
the construction schema of these forms must therefore include a device giving access to the
members of the family of X. Using the analysis of the adjective ANTIMILITARISTE, let us see
how this mechanism is implemented in ParaDis.

The representation of ANTIMILITARISTE in Figure 5 is distributed over four dimensions: it
is an adjective; it instantiates the conceptual class of opposition, as indicated by the label
CONTRE in the CS; and it satisfies the formal pattern ɑ̃tiXist in the CF. The module of
ANTIMILITARISTE includes in its CL the noun MILITAIRE, with which ANTIMILITARISTE stands
in a semantically motivated relation: « une chanson antimilitariste » ('an antimilitarist
song') is 'a song against the military', and more generally 'a song against the army'. The
connection between the two lexemes is therefore inherited from the semantic component,
where the concept CONTRE necessarily requires the existence of an entity (ENTITÉ) that is the
object of this opposition. This relation is regular: any entity (concrete or abstract) can
trigger a reaction of opposition, and to any hostile attitude there necessarily corresponds a
rejected object.

By contrast, there is no formal justification for this relation: MILITAIRE, whose form is an
instance of the pattern Yɛʁ (taking militaire to be formed on the suppletive stem °/milit/ of
ARMÉE), does not allow ɑ̃tiXist to be predicted, and vice versa. There is thus a mismatch
between the semantic regularity and the absence of a formal link between ANTIMILITARISTE
and MILITAIRE, as Figure 5 illustrates: the solid line connecting CONTRE and ENTITÉ in the CS
has no counterpart in the CF. Semantic motivation alone therefore justifies the relation that
holds, in the CL, between ANTIMILITARISTE and MILITAIRE.


Figure 5: Component of the analysis of ANTIMILITARISTE: the semantic motivation
ANTIMILITARISTE ← MILITAIRE


Since the form of ANTIMILITARISTE does not coincide with its semantic construction, the
motivation for its morphological structure must be sought in the derivational neighbourhood
of the adjective. The noun (and adjective) MILITARISTE meets this requirement. Formally,
MILITARISTE is an instance of the pattern Xist and stands in a relation of interpredictability
with ANTIMILITARISTE, since anti- frequently appears in structures with a final /ist/⁷. This is
what Figure 6 represents. By contrast, the relation between ANTIMILITARISTE and MILITARISTE
has no semantic motivation, as indicated by the absence of a relation of interpredictability
between the semantic categories CONTRE and PARTISAN in Figure 6: the emergence of
adversative behaviour (CONTRE) does not require the existence of a PARTISAN.


Figure 6: Component of the analysis of ANTIMILITARISTE: the formal motivation
ANTIMILITARISTE ← MILITARISTE


PARTISAN, the semantic category of MILITARISTE, is on the other hand inseparable from
that of the valued object, which may be an ideology (le pointillisme, for the pointilliste), an
individual (Sarkozy, for the Sarkoziste), a function (le pape, for a papiste), an activity
(bouger, for the bougiste), a concrete object (la viande, for the viandiste), etc. In other
words, it is an unconstrained conceptual entity, which we represent by the class ENTITÉ (see
Roché (2007, 2011a) for a detailed analysis of suffixation in -isme and -iste in French). The
relation is also predictable in the CF: suffixation in -iste shows a notable affinity with
structures ending in /ɛʁ/⁸. The assembly of the four components, illustrated in Figure 7,
shows that MILITARISTE forms with MILITAIRE a semantically and formally regular module: the
geometry in the four components is isomorphic.


⁷ In TLFindex, for example, 11% of the adjectives of the form ɑ̃tiX end in -iste (i.e. are
instances of ɑ̃tiXist).


⁸ Nouns and adjectives in Xaʁist make up 4% of the entries in Xist in TLFindex.
Figure 7: Regular module (MILITARISTE, MILITAIRE)


Bringing together the elements of analysis just presented, we see that the form and
meaning of ANTIMILITARISTE result from a combination of factors that contribute unequally:


1. ANTIMILITARISTE and MILITAIRE are semantically motivated by each other (Figure 5);


2. ANTIMILITARISTE and MILITARISTE are formally motivated by each other (Figure 6);


3. MILITARISTE and MILITAIRE are semantically and formally connected (Figure 7).


This convergence of properties involves the unification, at the level of the CF, of the X of
Figure 7 with the Yɛʁ of Figure 5, which leads to the specification (21b) of the formal
relation (21a) of suffixation in /ist/. The /ɛʁ/-/aʁ/ variation in (21b) is due to the proximity
of the vowel /ɛ/ to the /i/ of /ist/:


(21)  a.  X      ↔  Xiste
          ↑         ↑
      b.  Yaire  ↔  Yariste


The result, shown in Figure 8, is a module whose three components are fully
interconnected, with a lexical component forming a complete graph, and semantic and formal
components each forming a connected acyclic graph whose linked vertices differ. As can be
seen, Figure 8 is a simple superposition of the sub-modules of Figures 5, 6 and 7. The
mismatch between the three abstract components in Figure 8 shows in the geometry of the
components of the complete module of ANTIMILITARISTE. It contrasts with the regular
geometry of the module of (VIDER, VIDEUR, VIDAGE) illustrated in Figure 4, whose
paradigmatic canonicity shows as the co-presence of three isomorphic triangles.

Figure 8: Module describing the analysis of ANTIMILITARISTE
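
The geometric contrast described for the ANTIMILITARISTE module can be checked mechanically. The sketch below is an illustration under stated assumptions, not the authors' formalization: the relations of the semantic (CS) and formal (CF) components are projected onto the three lexemes they link, following Figures 5 to 7 as described in the text, and the helper names are hypothetical.

```python
from itertools import combinations

LEXEMES = ["ANTIMILITARISTE", "MILITARISTE", "MILITAIRE"]


def edge(a, b):
    """An undirected relation between two lexemes."""
    return frozenset((a, b))


# Assumed projection of each abstract component onto the lexemes.
# CS: CONTRE-ENTITÉ (Figure 5) and PARTISAN-ENTITÉ (Figure 7).
CS = {edge("ANTIMILITARISTE", "MILITAIRE"), edge("MILITARISTE", "MILITAIRE")}
# CF: ɑ̃tiXist-Xist (Figure 6) and Xist-Yɛʁ (Figure 7).
CF = {edge("ANTIMILITARISTE", "MILITARISTE"), edge("MILITARISTE", "MILITAIRE")}
# CL: superposition of the relations motivated in CS or in CF.
CL = CS | CF


def is_connected(vertices, edges):
    """Depth-first search over an undirected edge set."""
    vertices = list(vertices)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        v = stack.pop()
        for e in edges:
            if v in e:
                (w,) = e - {v}  # the other endpoint of the edge
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return len(seen) == len(vertices)


def is_tree(vertices, edges):
    """Connected and acyclic: exactly |V| - 1 edges on a connected graph."""
    return is_connected(vertices, edges) and len(edges) == len(vertices) - 1


def is_complete(vertices, edges):
    """Every pair of vertices is linked."""
    return set(edges) == {edge(a, b) for a, b in combinations(vertices, 2)}


print(is_tree(LEXEMES, CS), is_tree(LEXEMES, CF))  # both connected and acyclic
print(CS != CF)                                    # they link different vertices
print(is_complete(LEXEMES, CL))                    # the lexical triangle is closed
```

In this reading, the lexical triangle is closed only because the semantic and formal components each contribute one relation that the other lacks, which is the non-canonical situation the text contrasts with the three isomorphic triangles of (VIDER, VIDEUR, VIDAGE).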


6.5 Summary


The ParaDis model results from a threefold heritage. It draws on Bochner's Cumulative
Patterns, which essentially describe the formal and categorial components of morphological
derivation. ParaDis extends them to the semantic dimension of paradigms and takes
advantage of the independent and simultaneous operation of the formal, categorial and
semantic components of RCL, and of the three-dimensional nature of the lexemes to which
they apply. Finally, ParaDis adopts, with a view to formalizing it, the network organization
of constructional morphology initiated by the DUMAL axis, which it complements by
articulating it with the paradigmatic structures of derivational family and derivational series.

In this way, the distribution and processing of the morphological information that
ParaDis uses to analyse morphological constructions, and parasynthetic derivatives in
particular, takes place on three planes:


- along the three classical dimensions of the lexeme;

- among the members of the derivational family of the derivative to be described;


- through the relations between the derivational series to which the lexemes of this
family belong.


With this multi-level organization, ParaDis can capture morphological construction both
in the form of binary relations and from the point of view of more complex modules that
instantiate the paradigmatic motivation networks of morphological derivatives; the proposed
organization makes it possible to treat all types of derivatives uniformly, however far they
depart from the ideal situation of formal and semantic transparency. Compared with the
models that preceded it, ParaDis can therefore handle the apparent constructional anomalies
displayed by parasynthetic derivatives without resorting to analytical artefacts: the
mechanisms used to analyse them are strictly identical to those used to analyse canonical
derivations. The asynchronous formal and semantic relations that cause them to depart from
the canonical situation are considered separately, which, in the case of parasynthesis,
results in the motivation of the prefix and that of the suffixal sequence becoming
autonomous. The availability of the family of the parasynthetic derivative, distributed over
the different components, and its network structure make it possible to compute the
appropriate form of the final sequence.
Chapter 16


Much ado about morphemes 


Hélène Giraudo
CLLE, Université de Toulouse, CNRS, Toulouse, France 


Most of the psycholinguists working on morphological processing nowadays admit that 
morphemes are represented in long-term memory and the predominant hypothesis of lex- 
ical access is morpheme-based as it supposes a systematic morphological decomposition 
mechanism taking place during the very early stages of word recognition. Consequently, 
morphemes would stand as access units for any item (i.e., word or nonword) that can be split 
into two morphemes. One major criticism of this prelexical hypothesis is that the mecha- 
nism can only be applied to regular and perfectly segmentable words and, more problematic, 
it reduces the role of morphology to surface/formal effects. Recently, Giraudo & Dal Maso 
(2016) discussed the issue of morphological processing through the notion of morphological
salience (defined as the relative role of the word and its parts) and its implications for
theories and models of morphological processing. The issue of the relative prominence of 
the whole word and its morphological components has indeed been overshadowed by the 
fact that psycholinguistic research has progressively focused on purely formal and superfi- 
cial features of words, drawing researchers' attention away from what morphology really is: 
systematic mappings between form and meaning. While I do not deny that formal features 
can play a role in word processing, an account of the general mechanisms of lexical access 
also needs to consider the perceptual and functional salience of lexical and morphological 
items. Consequently, if the sensitivity to the morphological structure is recognized, I claim 
that it corresponds to secondary and derivative units of description/analysis. Focusing on 
salience from a mere formal point of view, I consider in the present contribution how a 
decompositional hypothesis could deal with some phonological endings whose graphemic 
transcriptions are various. To this end, a distributional study of the final sound [o] in French 
is presented. The richness and the diversity of the distributions of this ending (in terms of 
type of forms, size and frequency) reveal that paradigmatic relationships are more suitable 
to guide morphological processing than morphological parsing as suggested by the lexeme- 
based approach of morphology (see Fradin 2003). 


1 Introduction 


In the domain of linguistics, morphological analysis is conceived according to two an- 
tagonistic approaches. On the one hand, the morpheme-based approach (exemplified by 
the theoretical framework of Distributed Morphology, see Halle & Marantz 1993, 1994)
integrates morphology with syntax and considers morphemes as basic minimal forms.
On the other hand, the lexeme-based approach postulates that words are the first units of
analysis (e.g. Corbin 1987, Aronoff 1994, Fradin 1996). Psycholinguistic research aiming
to understand the cognitive processes underlying word processing has broadly explored
the effects of morphological processing on lexical access.
While it is widely accepted that morphological information plays a crucial role during
word processing, its representation is still controversial. Nowadays most psycholinguists
support a decompositional view of morphological processing (see Rastle & Davis
2008 for a review) that can be linked to the morpheme-based hypothesis, while a few
of them defend an opposing view according to which words are recognized holistically.
This last procedural hypothesis, which is clearly in line with the lexeme-based approach, 
is tested in the present chapter through a qualitative and quantitative study of words end- 
ing in [o]. The distribution of this ending is so diverse that it would cause a huge number 
of procedural errors of morpheme decomposition. Conversely, the lexeme-based/holistic 
approach to morphology seems to be much more appropriate to encompass the diversity. 


2 Studying morphological processing 


In a seminal experimental study carried out by Taft & Forster (1975) on the recognition of 
nonwords, the idea of morphological decomposition was first introduced. They showed 
that 1) nonwords (e.g., juvenate) corresponding to an English stem induced longer re- 
jection latencies than nonwords that were not stems (e.g., pertoire) and 2) prefixed non- 
words constructed with an English prefix and stem (e.g., dejuvenate) took longer to be 
classified compared to morphologically simple control items (e.g., depertoire). Longer 
decision latencies were interpreted as reflecting a pre-lexical mechanism of morpholog- 
ical decomposition by which all the words (real or possible) would be accessed via the 
first activation of their stem. Forty years of experimental research have been focused on 
testing this decomposition hypothesis by manipulating the characteristics of morpho- 
logically complex words and nonwords (i.e., their form in terms of decomposability and 
interpretability, their lexical frequency and more rarely their lexical environment) in var- 
ious perceptual tasks (with a large dominance of the lexical decision task which consists 
in a word/nonword discrimination) and numerous languages (most studies focusing on 
English, however). Most of the results have been interpreted as supporting the decompo- 
sitional view (see the reviews of Amenta & Crepaldi 2012, Diependaele et al. 2012) with- 
out really questioning the linguistic processes underlying the construction of complex 
words. An overview of the tested hypotheses and the materials used to explore complex 
word recognition indeed reveals a lack of consideration of the paradigmatic characteris- 
tic of words for understanding the cognitive mechanisms underlying lexical access. Nu- 
merous studies mainly focused on the formal properties of the word and extended the 
morphological sensitivity effects observed with complex nonwords to complex words 
(e.g. Taft & Forster 1976, Caramazza et al. 1988, Laudanna et al. 1997, Crepaldi et al. 2010) 
failing to consider semantic aspects of morphological complexity. Many experimental
reports examined morphological processing using the masked priming paradigm (Forster
& Davis 1984), which is supposed to reflect the automatic and nonconscious processes
engaged in the very early stages of word recognition. In this paradigm, two visually related
items are presented successively and participants are asked to perform a lexical decision
indicating whether the second item is a word or not. However, because the prime word
is masked and presented very briefly, the reader is not even aware of its presence
before seeing the target item.¹ Hence, the paradigm allows examination of the effects of
the unconscious processing of the prime on target recognition (see
Kinoshita & Lupker 2004 for a review on masked priming). Many masked priming studies
demonstrated that when two words are morphologically related (e.g., singer-sing), the 
prior presentation of the prime shortens the recognition latency of the target relative 
to both a baseline condition in which the prime is completely unrelated to the target 
(e.g., baker-sing) and an orthographic condition that uses a prime that is only formally 
related to the target (e.g., single-sing). Accordingly, morphological priming effects do 
not result from the mere formal overlap shared by prime-target. Other studies showed 
that semantic priming effects (e.g., cello-violin) only arise when the prime duration is
sufficiently long (i.e., > 72 ms; see Rastle et al. 2000 for a comparison between
morphological, orthographic and semantic priming effects using different Stimulus-Onset
Asynchronies). This general result suggests that priming effects result from morphological
relationships shared by prime-target pairs and that morphologically related words
are connected by some kind of excitatory links. Most of the models of lexical access have 
tried to account for these morphological effects. 


3 Psycholinguistic models of morphological processing 


The architecture of psycholinguistic models of word recognition is mostly based on 
symbolic interactive activation models (e.g. McClelland & Rumelhart 1981). This type of 
model is organized in hierarchical levels of processing containing symbolic units. Each 
level corresponds to a linguistic characteristic of words, from letter features to seman- 
tics. During word recognition, activation spreads from the lowest to the highest levels. 
Within-level units are connected by inhibitory links, whereas units at different levels are
connected by excitatory links. Consequently, the model functions according to a principle of
competition between within-level units that is compensated by both bottom-up and top-down
excitations. The independence of the morphological effects relative to mere formal and 
semantic effects being established, morphological information was usually represented 
as a separate level of processing. However, its locus relative to the formal level
(phonological and orthographic descriptions of the words) and the semantic level is still
controversial. Morphological units have been situated variously: before the formal level,
where they stand as access units to the mental lexicon (see Figure 1a depicting the sublexical
model, Taft 1994); at the interface of the formal and the semantic levels, organizing the word
representations in morphological families (see Figure 1b, the supralexical model, Giraudo
& Grainger 2001); or in both places, before and after the formal level (see Figure 1c, the
hybrid/dual route model, Diependaele et al. 2009; see also Diependaele et al. 2013).


¹ The Stimulus Onset Asynchrony is usually less than 50 milliseconds, which corresponds to
subliminal processing.


Figure 1: Alternative hierarchical models of morphological processing: (a) sublexical model,
(b) supralexical model, (c) hybrid model. From input to concepts, the levels shown are:
(a) complex/pseudo-complex word, morphemes, words, concepts; (b) complex word, words,
morphemes, concepts; (c) complex/pseudo-complex word, morpho-orthographic units, words,
morpho-semantic units, concepts.


These three options nevertheless assume morpheme representations and by extension, 
propose a decompositional view of morphology. The sublexical and the hybrid models of 
morphological processing actually state very clearly that complex words are systematically
decomposed into morphemes during lexical access. This decomposition mechanism
is reflected by the obligatory activation of morphemes in order to reach the word
representations coded within the mental lexicon. Each time a complex or a pseudo-complex
word (i.e., a word with a surface morphological structure, for example the word corner, which
comprises a surface stem corn- and a surface suffix -er) is processed by our cognitive system,
it triggers the activation of its constituent morphemes, which in turn activate the
wordforms containing them. Moreover, the hybrid model supposes that “In a priming context,
opaque morphological relatives will only be able to prime each other through shared
representations at the morpho-orthographic level, whereas transparent items will also 
be able to do this via shared representations at the morpho-semantic level” (Diependaele 
et al. 2009: 896). Even if the authors claim that morphological representations per se are
not simply represented at both levels (the first being orthographically constrained and
the second semantically constrained), these two levels actually correspond to surface
morphemes, at least as far as the units they contain are concerned. In these two frameworks
(sublexical and hybrid models), morphological priming effects result from the
pre-activation of the morpheme shared by the prime and the target before accessing the 
word representations. These morphemic units pre-select in a way the wordforms that 
can potentially match with the target to be recognized. Lexical access takes place via the 
obligatory activation of surface morphemes. 

One major criticism of the prelexical hypothesis is that this mechanism can only be 
applied to regular and perfectly segmentable words. Even more problematic is the fact 
that it reduces the role of morphology to surface/formal effects. This is certainly why 
Diependaele and colleagues proposed a second level of representation for morphology, 
as numerous experimental studies showed that two morphologically related but
orthographically unrelated words (e.g., bought-buy) prime each other. However, this solution
only considers morphology from its syntagmatic dimension, that is, according to the
word-internal structure. Therefore, nothing is said about the influences of family and
series² on word representations.

The original version of the supralexical model (Giraudo & Grainger 2001) also inte- 
grated morphemes even though it did not suppose a decomposition mechanism by which 
word representations are decomposed properly in order to activate their semantic rep- 
resentations. On the contrary, the morphological level contained “emerging” base mor- 
phemes, that is, morpheme representations resulting from the acquisition of complex 
words that are derived from the same base or the same series. Accordingly the mor- 
phological node organizes the word level in paradigms (i.e., morphological families and 
series), morphologically related words being connected together thanks to a supralexi- 
cal node. Concretely, when the system processes a complex word, it first activates all 
the word representations that match formally with it, while at the same time the complex
forms activate their common nodes, which send positive feedback to these forms. As all
units belonging to the same level compete with each other, the activated formally re- 
lated words inhibit each other, but those which are also morphologically related receive 
facilitation from their shared node. Words from the same family are then less inhibited 
than the other representations at the word level. In masked priming, the morphological 
facilitation between two morphologically related words observed relatively to two un- 
related words is explained in terms of a reduced inhibition effect compared to a regular 
inhibition effect for unrelated items. 
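
The reduced-inhibition account just described can be illustrated with a deliberately small numerical sketch. It is not the implemented supralexical model: the one-step update, the parameter values and the activation levels are all assumptions chosen for illustration, and the function name is hypothetical. The items singer, sing and single are the examples used earlier in this chapter.

```python
def net_input(bottom_up, families, inhibition=0.2, feedback=0.3):
    """One-step net input for each word-level unit.

    bottom_up: activation each word receives from formal overlap with the prime.
    families:  supralexical nodes, each mapping to the set of words it connects.
    Every word is inhibited by all other active words; words attached to a
    supralexical node also receive feedback proportional to that node's input.
    """
    total = sum(bottom_up.values())
    net = {}
    for word, activation in bottom_up.items():
        lateral = inhibition * (total - activation)
        support = sum(
            feedback * sum(bottom_up.get(member, 0.0) for member in members)
            for members in families.values()
            if word in members
        )
        net[word] = activation - lateral + support
    return net


# Hypothetical activations after a briefly presented prime "singer".
bottom_up = {"singer": 1.0, "sing": 0.6, "single": 0.6}
families = {"SING": {"singer", "sing"}}

net = net_input(bottom_up, families)
print(net["sing"] > net["single"])  # the family member ends up less inhibited
```

Both sing and single overlap formally with the prime and suffer the same lateral inhibition, but only sing is compensated by feedback from the shared node, which is the sense in which facilitation is modelled as reduced inhibition.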


4 The benchmark effects: lexicality, frequency, regularity 


Among the factors that have been manipulated in order to better understand the nature
of morphological relationships and the locus of morphological priming effects within
the mental lexicon, one can cite lexicality, frequency and regularity. Starting from the
dominant hypothesis according to which words are first decomposed before accessing
the mental lexicon, some authors used the masked priming paradigm to study the influence
of lexicality (i.e., comparing the processing of existing words coded in the mental
lexicon relative to non-existing but morphologically structured items) in word recognition.
A series of masked priming studies examined the effect of complex nonword
primes during the early processes of lexical access. For example, Longtin & Meunier
(2005) tested the effects of nonwords constructed using legal and illegal combinations
of existing stems and suffixes in French (e.g., legal: infirmiser-infirme, ‘disabled+er’-‘disabled’;
illegal: garagité-garage, ‘garage+ité’-‘garage’) and found that both types of
nonwords produced facilitation relative to orthographic control primes (e.g.,
rapiduit-rapide, ‘fast+uit’-‘fast’), which did not induce any significant effect on word
recognition (see also McCormick et al. 2009, Morris et al. 2013 for English materials).
Giraudo & Voga (2013) replicated these results using French prefixed nonwords (e.g.,
infaire-faire, ‘un-do’-‘do’), suggesting that these effects apply to all affixed items.
Duñabeitia et al. (2008) focused on affix priming in Spanish and showed that isolated
suffixes (e.g., dad-igualdad, ‘ity’-‘equality’) and suffixes in neutral context (e.g.,
#####dad-igualdad) were also able to induce positive priming effects (see also Crepaldi
et al. 2016 using English suffixed related nonword pairs like sheeter-teacher). Finally,
Crepaldi et al. (2013) examined reversed compounds like fishgold-goldfish and observed
facilitation within related prime-target pairs.


² The term ‘series’ was, to our knowledge, first introduced by Hathout (2005, 2008) and refers to groups of
words sharing the same affix.

Taken together, these studies suggest that in the early stages of word recognition
(in masked priming conditions in which primes are presented for less than 50-60 ms),
lexicality does not affect lexical access as far as complex nonwords are concerned.
Moreover, none of these studies found priming effects using orthographic nonword
primes (e.g., blunana-blunt, tested by McCormick et al. 2009), suggesting a pre-lexical
morphological analysis of the primes that is blind to lexicality. However, even if these data
seem to strengthen the pre-lexical decomposition hypothesis, results obtained using
nonword primes created by letter transpositions have to be considered. Following the
discovery at Cambridge University according to which “it deosn't mttaer in waht oredr
the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at
the rghit pclae... it doesn’t matter in what order the letters in a word are, the only
important thing is that the first and last letter be at the right place” (see
http://www.mrc-cbu.cam.ac.uk/personal/matt.davis/Cmabrigde/), a series of masked priming
experiments aimed to explore this effect. Although some studies showed that reading
comprehension of jumbled words is more or less costly (as demonstrated, for example, by
Rayner et al. 2006), this effect still constitutes a challenge for the decompositionalists.
It indeed contradicts the hypothesis according to which lexical access takes place via the
obligatory decomposition of complex words into morphemes. Masked priming experiments
explored repetition priming effects (i.e., the same stimulus is presented as prime and
target, as in table-table) and morphological priming effects using jumbled primes.
Beyersmann et al. (2012) first found that, relative to unrelated primes, both repeated
simple primes (e.g., wran-warn) and morphological primes (e.g., wranish-warn) reduced the
latencies of target word recognition (see also Christianson et al. 2005, Duñabeitia et al.
2007 for Spanish and Basque). However, when orthographic primes (e.g., wranel-warn) were
manipulated, no facilitation was observed, highlighting the need to keep the morpheme
boundary intact for priming effects to emerge. Then, a series of experiments compared
primes with Transposed Letters (TL) at the morpheme boundary (e.g., speaekr-speak) vs.
outside the morpheme boundary (e.g., spekaer-speak). Only one experiment in the literature
reported a benefit for TL primes when the transposition fell within the morpheme but
no benefit when the transposition fell across the morpheme boundary
(Duñabeitia et al. 2007, using Spanish materials). Subsequent investigations in both
English and Spanish failed to replicate these findings (Beyersmann et al. 2012, 2013, Rueckl
& Rimzhim 2011, Sánchez-Gutiérrez & Rastle 2013) and obtained equivalent facilitation
when the transposed letters appeared within a stem or across a morpheme boundary.
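
The construction of the two kinds of TL primes contrasted above can be illustrated with
a short sketch. It is only an illustration under stated assumptions: the function names
transpose_adjacent and crosses_boundary are hypothetical, a transposition is assumed to
swap two adjacent letters, and the stem length is supplied by hand rather than derived
from a morphological parser. It is not the stimulus-generation procedure of the cited
studies.

```python
def transpose_adjacent(word: str, i: int) -> str:
    """Swap the letters at positions i and i + 1 (0-based)."""
    if not 0 <= i < len(word) - 1:
        raise ValueError("i must index a letter that has a right neighbour")
    letters = list(word)
    letters[i], letters[i + 1] = letters[i + 1], letters[i]
    return "".join(letters)


def crosses_boundary(i: int, stem_length: int) -> bool:
    """True when the swap at i, i + 1 straddles the stem/suffix boundary."""
    return i == stem_length - 1


# speak + er: the stem length (5) is a hand-coded assumption.
word, stem_length = "speaker", 5
boundary_prime = transpose_adjacent(word, 4)  # swaps k and e, across the boundary
stem_prime = transpose_adjacent(word, 3)      # swaps a and k, inside the stem

print(boundary_prime, crosses_boundary(4, stem_length))
print(stem_prime, crosses_boundary(3, stem_length))
```

With these inputs the sketch yields the two example primes cited in the text for the
target speak, one with the transposition across the morpheme boundary and one with the
transposition inside the stem.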

Because the TL benefit is not affected by the position of the transposition relative to the
morpheme boundary, I consider this result a strong challenge for any decompositionalist
model. If morphologically complex stimuli are indeed systematically decomposed into
morphemes before accessing the mental lexicon, the main prediction of such models is that
no priming effect should arise when the morpheme boundary is disrupted, since the
cognitive system cannot parse the item into potential morphemes.

Diependaele et al. (2013) further investigated the TL effect by comparing semantically
transparent vs. opaque complex primes. Their first experiment showed that, relative
to formal primes, both transparent and opaque primes induced positive priming
(e.g., banker-bank = corner-corn > scandal-scan). However, when morphological primes
with TL were used, the transparent ones produced priming while the opaque ones did
not (e.g., banekr-bank > corenr-corn = scandal-scan). A second experiment manipulated
derived nonword primes in order to examine the effect of lexicality on the TL
effect. Materials were selected from Longtin & Meunier's (2005) study and the authors
found, on the one hand, that relative to unrelated primes, both intact derived word
primes and intact derived nonword primes facilitated target recognition equally (e.g.,
garagiste-garage = garagité-garage > diversion-garage). On the other hand, when comparable
morphological primes with TL were manipulated, a different pattern of priming
emerged: only derived word primes induced priming (e.g., garaigste-garage > garaigté-garage
= diverison-garage). According to the authors, these data are in line with the predictions
of their hybrid/dual route model of morphological processing (presented above in
Figure 3), in which complex items are automatically parsed at two morphological levels,
morpho-orthographic and morpho-semantic, reflecting two sources of morphemic
activation in word recognition. Morphologically complex words (e.g., banker) are
supposed to be processed twice, at both morphemic levels, and pseudo-complex
words (e.g., corner) only once, at the morpho-orthographic level. Letter transposition
across the morpheme boundary should interfere more with morpho-orthographic than with
morpho-semantic processing. Accordingly, transparent words and nonwords with TL are
supposed to resist letter transpositions thanks to the morpho-semantic activation, while
opaque words and nonwords with TL do not because the morphemic activation at the
morpho-orthographic level would be skipped.

In my view, the dual route model and the way masked priming effects are interpreted
in this study are far from convincing. “The key prediction of this account
is that fast-acting effects of morphology are not only morpho-orthographic in nature,
but also morpho-semantic, and most importantly, that these effects reflect two separate
sources of morphemic activation in word recognition” (p. 989).

If genuine complex words benefit from two sources of activation (morpho-orthographic
and morpho-semantic) and pseudo-complex words from only one (morpho-orthographic),
words like banker should be more efficient primes than corner. Nevertheless,
their results (experiment 1) and the ones obtained so far in the literature demonstrate on
the contrary that prime-target pairs like banker-bank and corner-corn produce equivalent
priming effects (cf. surface morphology effects, see Rastle & Davis 2008 for a review).
When TL effects are considered, it has been shown that primes with TL at the morpheme
boundary (e.g., banekr-bank) and within the stem (e.g., bakner-bank) induce
equivalent facilitation effects. If the morpho-orthographic level is much more sensitive
to letter order than the morpho-semantic level is, then one should have observed greater
priming effects when the morpheme boundary of the prime is intact (e.g., bakner-bank),
because two sources of activation could operate, whereas with a jumbled morpheme boundary
(e.g., banekr-bank) only one source is active. The results obtained so far did not show
any difference between these two types of primes, neither in the present paper nor in
the literature. Moreover, Diependaele et al. (2013) found in their experiment 2 that TL
derived primes (e.g., banekr-bank) produced faster reaction times than intact primes
(e.g., banker-bank). This surprising result is also very problematic for a decompositional
account, since the letter recoding that TL primes require in order to activate
morphemic representations should have delayed lexical access, thereby reducing priming.

Word processing is also closely linked to input frequency. This factor has been
broadly studied in the psycholinguistic literature on word recognition, which shows a strong
and very robust correlation between lexical frequency and recognition latencies: the
higher the frequency, the shorter the reaction time (see Ellis 2002 for a review).
Generally, these experimental studies contrast derived or inflected words of comparable
surface frequency that crucially differ in their stem frequency (high vs. low). In this
kind of study, when reaction times (RTs) are found to be a function of stem frequency,
this is taken as evidence that word recognition implies the activation
of the stem. For example, in Italian, Burani & Caramazza (1987) investigated derived
suffixed forms (verbal roots combined with highly productive suffixes such as -mento, -tore,
-zione) by contrasting stimuli matched for whole word frequency but differing in root
frequency (experiment 1) with stimuli matched for root frequency but differing in whole word
frequency (experiment 2). Their results indicated that reaction times were influenced by
both root and whole word frequencies (faster RTs were obtained for items containing
a high frequency root in experiment 1 and for higher whole word frequency items in
experiment 2), and the authors suggested that the access procedure crucially operates with
both whole word and morpheme access units. Frequency effects have also been observed
in French by Colé et al. (1989), who similarly considered derived words matched for
surface frequency but differing in their cumulative root frequency (e.g., jardinier ‘gardener’,
containing a high frequency root, vs. policier ‘policeman’, containing a low frequency
root). Since a clear cumulative root effect was observed only for suffixed words but not
for prefixed ones, Colé and colleagues suggest that only the former are accessed through
decomposition via the root.

More recently, Burani & Thornton (2003) conducted a study on the interplay between 
the frequency of the root, the frequency of the suffix and the whole word frequency 
in processing Italian derived words. More precisely, in experiment 3, they considered 
low frequency suffixed words that differed with respect to the frequency of their
morphemic constituents. As expected, the results showed that lexical decisions were fastest
and most accurate when the derived words included two high-frequency constituents
(e.g., pensatore ‘thinker’) and slowest and least accurate when both constituents had low
frequency (e.g., luridume ‘filth’). Interestingly, when the derived words included only
one high-frequency constituent (either the root or the suffix), the lexical decision rate 
was found to be a function of the frequency of the root only, irrespective of suffix fre- 
quency. The authors conclude that access through activation of morphemes is beneficial 
only for derived words with high frequency roots, while lexical decision latencies to suf- 
fixed derived words are a function of their surface frequency when they contain a low 
frequency root. 

To sum up, frequency effects have been considered as a diagnostic for determining 
whether an inflected or derived form is recognized through a decompositional process 
that segments a word into its morphological constituents or through a direct look-up of 
a whole word representation stored in lexical memory. Frequency has therefore played a 
crucial role in the debate which opposed full parsing models, which assume a prelexical 
treatment of the morphological constituents with a consequent systematic and compul- 
sory segmentation of all complex words (Taft & Forster 1975, Taft 1979), and full listing 
models, which defend a non-prelexical processing of the morphological structure and a 
complete representation of all morphologically complex words (see McClelland & Rumel- 
hart 1981). 

Despite the importance of frequency for lexical access (the more frequent a word,
the faster its recognition, see Solomon & Postman 1952) and the number of priming
studies focused on its impact on word recognition (see Kinoshita 2006 for a review), very few
studies have manipulated frequency using masked morphological priming. In a paradigm
such as masked priming, in which the prime is presented for a very brief duration,
frequency is nevertheless a crucial factor since it determines the speed of access to lexical
representations. Moreover, clearly opposite predictions can be derived from the two main
approaches to morphological processing. According to the decompositional approach,
only the root/stem frequency should interact with morphological priming effects since
complex words are supposed to be accessed via the activation of their stem. The holistic
hypothesis predicts no stem frequency effect but predicts that surface frequency strongly
determines masked morphological priming effects, because lexical access operates on the
whole word. Giraudo & Grainger (2000) investigated the interaction of both frequencies
with morphological processing through a series of masked priming experiments
conducted in French. They manipulated the surface frequencies of derivatives used as
primes for the same target (high frequency primes like amitié-ami ‘friendship’-‘friend’;
low frequency primes like amiable-ami ‘friendly’-‘friend’). They found an interaction
between priming effects and the prime surface frequency (experiment 1), but no effect
for the base frequency. Experiments 1 and 3 demonstrated that the surface frequency of
morphological primes affects the size of morphological priming: high surface frequency
derived primes showed significant facilitation relative to orthographic control primes
(e.g., amidon-ami ‘starch’-‘friend’), whereas low frequency primes did not. The results
of experiment 4 revealed, conversely, that cumulative root frequency does not influence
the size of morphological priming on free root targets. Suffixed word primes facilitated
the processing of free root targets with low and high cumulative frequencies. These data
suggest that during the early processes of visual word recognition, words are accessed
via their whole form (as reflected by surface frequency effects) and not via
decomposition (since the base frequency did not interact with priming).

Another piece of evidence against the decompositional hypothesis comes from the
study conducted by Giraudo & Orihuela (2015), which considered the effects of the
relative frequencies of complex primes and their base targets in French, contrasting a
configuration with high frequency primes and low frequency targets with a configuration
with low frequency primes and high frequency targets. Their results revealed that, relative
to both the orthographic and unrelated conditions, morphological priming effects emerged
only when the surface frequency of the primes was higher than the surface frequency of
the targets (see also Voga & Giraudo 2009 for a similar conclusion). Again, these data
contradict the prediction of the classical decomposition hypothesis, according to which
the reverse effects would be expected.

The interpretation of frequency effects with respect to psycholinguistic models, how- 
ever, remains very controversial. McCormick, Brysbaert, et al. (2009) defend a com- 
pletely opposite position, in favour of an obligatory decomposition of all kinds of stimuli 
(even for the non-morphologically structured ones). They carried out a masked priming 
experiment manipulating the frequency of the primes, thus comparing high frequency, 
low frequency and nonword primes. Their hypothesis was that if morphological
decomposition were limited to unfamiliar words, as predicted by horse-race style
dual-route models, then priming should be limited to the last two conditions. If, on the
contrary, morphological decomposition were a routine, obligatory process applying to all
morphologically structured stimuli, priming should occur in all three conditions. The results showed
that the priming effect observed with high frequency primes was equivalent to the one 
observed with low frequency primes and with nonword primes. Such findings seem to 
confirm the claim that a segmentation process is not restricted to low frequency words 
or nonwords, as assumed by horse-race models. 

Very recently, the masked priming study carried out by Giraudo et al. (2016) on Italian
materials explored the role of stem frequency in morphological processing even more deeply.
They focused on the surface frequencies of base targets (comparing high vs. low surface
frequency targets, e.g., trasferire ‘to transfer’ vs. motivare ‘to motivate’) primed by
the same base (e.g., trasferire-trasferire), a derivation of the base (e.g.,
trasferimento-trasferire ‘transfer’-‘to transfer’), an orthographic control (e.g.,
trasparenza-trasferire ‘transparency’-‘to transfer’) or an unrelated control (e.g.,
sacrificio-trasferire ‘sacrifice’-‘to transfer’).
The data showed that full morphological priming effects were obtained whatever the
frequency of the targets (high or low). Accordingly, the frequency of the base contained in
the derived primes (e.g., trasferire in trasferimento) did not interfere with morphological
facilitation: primes whose base had a high frequency did not induce stronger facilitation
than primes with a low frequency base. As a consequence, contrary to the predictions of
a decompositional approach to lexical access to complex words, the prior presentation of
a complex prime whose stem had a high surface frequency did not accelerate the access
to its lexical representation relative to primes whose stem frequency was low.

Taken together, the frequency effects obtained using masked priming suggest that 
lexical access depends much more on the lexical frequency of the prime (which determines
its activation threshold) than on its stem frequency. Stem frequency does not seem to
interfere with the access to the mental lexicon and morphological priming effects reveal 
instead that, as soon as a lexical representation is activated within the mental lexicon, 
such a representation automatically triggers the activation of all its family members. 
The result of the overall activation of the morphological family is revealed in those LDT 
experiments in which it has been observed that both the lexical and the base frequencies 
determine the recognition latencies of suffixed words. Only models that consider the 
word as the main unit of analysis, be it morphological (e.g., Giraudo & Voga 2014) or not 
(e.g., Baayen et al. 2011), are able to account for these findings. 

Finally, regularity is another factor from which opposite predictions can be drawn by
the two views of morphological processing. In the psycholinguistic literature, this issue
is intimately linked with the ease with which a complex word can be segmented into
morphemes. Most of these studies consider morphology from the sole angle of
word-internal structure, and the reported experiments carried out with irregular words
aimed to test the prediction of the decomposition hypothesis according to which
parsability should interact with the magnitude of morphological priming effects. Regularity
has mainly been tested with irregular materials like the irregular verbs of English (e.g.,
bought-buy) and with complex words containing various orthographic alterations (e.g.,
bigger-big). Pastizzo & Feldman (2002) carried out a series of masked priming experiments
on English irregular verbs (viz. allomorphs). They found that allomorphs (e.g., fell), whose
construction does not enable decomposition, primed their verbal base (e.g., fall) more than
orthographically matched (e.g., fill) and unrelated control words (e.g., hope) did. Contrary
to the predictions of the decompositional hypothesis, non-segmentable complex words
thus induce priming effects that cannot be attributed to the formal overlap between
prime-target pairs but depend on the morphological relationships they share. These
results were later replicated by Crepaldi et al. (2010; see also the MEG study carried out
by Fruchter et al. 2013, leading to the same pattern of data), who were forced to admit
the “existence of a second higher-level source of masked morphological priming” and
proposed a lemma level composed of inflected words acting “at an interface between
the orthographic lexicon and the semantic system” (p. 949).

McCormick et al. (2008) manipulated another category of derived stimuli that cannot
be segmented perfectly into their morphemic components (for example, missing
<e> (e.g., adorable-adore), shared <e> (e.g., lover-love), and duplicated consonant (e.g.,
dropper-drop)) in order to test the flexibility of the morpho-orthographic segmentation
process described by morpheme-based models. Once again, their results demonstrate the
robustness of this segmentation process in the case of various orthographic alterations
in semantically related (e.g., adorable-adore) as well as in unrelated prime-target pairs
(e.g., fetish-fete). The same authors then addressed the same question using
morphologically structured nonword primes (McCormick et al. 2009). To this end, they created
nonword primes with a missing <e> at the morpheme boundary (e.g., adorage-adore)
and compared them with orthographically related prime-target pairs (e.g., blunana-blunt). The
observed data showed that morphologically structured nonword primes facilitated the
recognition of their stem targets, and that the magnitude of these priming effects was
significantly larger than for orthographic control pairs. They interpreted this result as
supporting their previous conclusions on word primes (2008), according to which stems
that regularly lose their final <e> may be represented in an underspecified manner (i.e.,
absent or marked as optional). But far from calling the decomposition mechanism into
question, they claimed that the process of morphological decomposition was robust to regular
orthographic alterations that occur in morphologically complex words.

In my view, these results could on the contrary be interpreted as being totally
incompatible with the hypothesis of a mandatory decomposition process based on
surface morphology, because this mechanism rests only on the minimal condition of
having two surface morphemes. If not, the decompositionalist approach needs to explain
how, and on which criteria, these words are actually decomposed. So far, the
decompositionalists have only proposed the idea of fast-acting morphological effects (see
Diependaele et al. 2013) without specifying on what visual/perceptual basis these effects
could actually operate. Recently, Giraudo & Dal Maso (2016) discussed this issue through the
notion of morphological salience and its implications for theories and models of
morphological processing. More precisely, the impact of the salience of complex words and
their constituent parts on lexical access was questioned in light of the benchmark
effects reported in the literature and the way they have been unilaterally interpreted. The
issue of the relative prominence of the whole word and its morphological components
has indeed been overshadowed by the fact that psycholinguistic research has progressively
focused on purely formal and superficial features of words, drawing researchers'
attention away from what morphology really is: systematic mappings between form and
meaning. While I do not deny that formal features can play a role in word processing, an
account of the general mechanisms of lexical access also needs to consider the perceptual
and functional salience of lexical and morphological items. Consequently, the existence
of morphemes is recognized, but we claimed that they correspond to secondary and
derivative units of description. I hold that results obtained on the basis of masked
priming are in line with holistic models of lexical architecture in which morphology emerges
from the systematic overlap between forms and meanings (Baayen et al. 2011)* and
for which the lexeme is the first unit of analysis for the cognitive system. In such models,
salience is not only a matter of internal structure, but also results from the organization
of words in morphological families and series. As a consequence, not only syntagmatic
but also paradigmatic relationships contribute to morphological salience. Certainly, the
notion of salience refers primarily to formal aspects, because the perceptual body of the
morpheme is necessarily the starting point of the processing mechanism. However, the
notion of salience makes sense for complex word processing only if the form it refers to
is associated with a meaning or function. Salience, in other words, is a property of the
morpheme (i.e., a stable association of form and meaning), not simply of a phonetic or
graphemic chain.

* And also with abstractive approaches assuming that “the lexicon consists in the main of
full forms, from which recurrent parts are abstracted” (Blevins 2006: 537).


5 The final sound [o] in French 


Focusing on salience from a purely formal point of view leads one to consider how a
decompositional hypothesis could deal with phonological endings that have several
graphemic transcriptions.

I present a distributional study of the final sound [o] in French suggesting that
paradigmatic relationships are better suited to guiding morphological processing than
morphological parsing. The data were selected from the Lexique 3 database (New 2006).

In French, the final sound [o] can be written in 9 different ways: 


(1) -au as in:
noyau, préau, tuyau, bestiau
‘core’, ‘courtyard’, ‘pipe’, ‘cattle’

(2) -aud as in:
noiraud, rougeaud, crapaud, nigaud
‘black+aud’, ‘red+aud’, ‘toad’, ‘idiot’

(3) -aut as in:
quartaut
‘quarter+aut’

(4) -eau as in:
poireau, grumeau, tableau, drapeau
‘leek’, ‘lump’, ‘board’, ‘flag’

(5) -od as in:
pernod
‘pernod’

(6) -op as in:
galop, sirop, trop
‘gallop’, ‘syrup’, ‘too much’

(7) -os as in:
gros, dos, enclos, chaos
‘big’, ‘back’, ‘pen’, ‘chaos’

(8) -ot as in:
bistrot, cachot, chiot, jeunot
‘pub’, ‘dungeon’, ‘puppy’, ‘youngster’

(9) -o as in:
auto, ado, mécano, fluo
‘car’, ‘teenager’, ‘mechanic’, ‘fluo’


Among these words, one can distinguish semantically transparent complex words 
(e.g., drap-eau) M+, semantically opaque complex words (e.g., crap-aud) M-, simple 
words (e.g., trop) O and clippings (e.g., ado from adolescent) C, whose distributions in 
terms of size, i.e., number of different words sharing the same ending (N) and cumula- 
tive frequencies of these words (F) are sometimes very heterogeneous. Tables 1 and 2 
present these different distributions. 


Table 1: Number of different words having the same ending.

Distribution of the size

Ending  Transparent    Opaque         Simple     Clippings  Total N
        complex words  complex words  words (O)  (C)        (M+, M-, O)
        (M+)           (M-)
-au     2              3              13         5          18
-aud    20             15             11         35         46
-aut    0              1              22         1          23
-eau    74             47             74         121        195
-od     0              0              1          0          1
-op     0              0              4          0          4
-os     0              0              179        0          0
-ot     43             46             130        89         221
-o      18             8              430        26         581
Total   157            120            864        277        1089

As one can see above, among the 9 possible transcriptions of the sound [o], 6 can
correspond to suffixes (i.e., -au, -aud, -aut, -eau, -ot, -o). This means that 66% of these
endings can correspond to a suffix. Moreover, endings in [o] are globally carried by a larger
number of simple words (864 for O vs. 277 for M), and these simple words are much more
frequent than complex words (13280 occ./million for O vs. 870 occ./million for M).

Table 2: Cumulative frequencies of words having the same ending.

Distribution of the cumulative frequency

Ending  Transparent    Opaque         Simple     Clippings  Total N
        complex words  complex words  words (O)  (C)        (M+, M-, O)
        (M+)           (M-)
-au     6.55           34.86          5350.20    41.41      5391.61
-aud    2050.73        67.31          184.53     108.04     292.57
-aut    0              0              2009.41    0.20       2009.61
-eau    169.35         300.23         1559.39    427.10     1986.49
-od     0              0              4.73       0          4.73
-op     0              0              868.94     0          868.94
-os     0              0              1596.89    0          1596.89
-ot     1.80           4.05           1002.49    263.69     1493.18
-o      1.17           1.05           703.47     29.48      1037.39
Total   2229.60        407.50         13280.05   869.92     14681.41

If we examine the size distributions of the different transcriptions, it appears that
-o represents more than half of the overall endings (581 words in -o for a total of 1089
words ending in [o]). Among complex words, the ending -eau dominates (121/277 = .44),
and only -eau (121 complex words for 74 simple words) and -aud (35 complex words for 11
simple words) show a morphological probability higher than an orthographic probability
(p(M-eau) = 121/195 = .62; p(M-aud) = 35/46 = .76). All the other endings are dominated by
simple words. This means that even if 66% of [o] endings can function as suffixes, their
morphological probability is very low (p(M) = 227/1084 = .21). Therefore, morphological
decomposition would lead to a procedural deadlock in 81% of the cases. Finally, when
the N distributions of M+ words are compared to those of M- words, we can see that M+
globally dominates M- (157 vs. 120), but when each ending is examined it appears that,
except for -eau (74 vs. 47) and -o (18 vs. 8), it is more a 50/50 ratio than a clear
dominance. This suggests that even when the cognitive system encounters a complex word,
morphological decomposition is semantically useless in 50% of the cases.
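
The per-ending morphological probabilities discussed above can be recomputed with a short
sketch. It rests on stated assumptions: the counts are copied from the M+, M- and O columns
of Table 1, the denominator is taken to be M+ + M- + O (which reproduces the values given
in the text for -eau and -aud), the printed row totals are not used, and the names
TABLE_1 and morphological_probability are illustrative only.

```python
# Counts copied from the M+, M- and O columns of Table 1.
TABLE_1 = {
    "-au": (2, 3, 13),
    "-aud": (20, 15, 11),
    "-aut": (0, 1, 22),
    "-eau": (74, 47, 74),
    "-od": (0, 0, 1),
    "-op": (0, 0, 4),
    "-os": (0, 0, 179),
    "-ot": (43, 46, 130),
    "-o": (18, 8, 430),
}


def morphological_probability(transparent: int, opaque: int, simple: int) -> float:
    """Share of complex words (M+ and M-) among all words with a given ending."""
    complex_words = transparent + opaque
    return complex_words / (complex_words + simple)


probabilities = {
    ending: morphological_probability(*counts) for ending, counts in TABLE_1.items()
}
# Endings for which complex words outnumber simple words.
suffix_dominated = sorted(e for e, p in probabilities.items() if p > 0.5)

for ending, p in probabilities.items():
    print(f"{ending:5s} p(M) = {p:.2f}")
print("p(M) > .5:", suffix_dominated)
```

Under these assumptions only -eau and -aud exceed .5, in agreement with the text; the
aggregate probability reported in the text is not recomputed here.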

If one turns now to the details of frequency distributions, the cumulative frequencies
of simple words are systematically higher than those of complex words, the highest
value being associated with simple words ending in -au (5350 occurrences per million).
As for the N distributions, the cumulative frequency of the suffixed words ending in -eau
dominates that of the other suffixed words (427 occ./million for a total of 870
occ./million). M+ words are much more frequent than M- words (2230 occ./million vs. 407
occ./million), but this dominance is explained by the cumulative frequency of M+ suffixed
words in -aud (2051 occ./million). When the data of -aud are removed, the cumulative
frequency of M- words (340 occ./million) becomes almost twice as high as that of M+ words
(179 occ./million). Altogether, this suggests that simple words and semantically opaque
complex words ending in [o] should be accessed more rapidly than the semantically
transparent complex ones.
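
The effect of setting aside the -aud data can be checked with a few lines of arithmetic.
This is a minimal sketch that takes the values printed in Table 2 at face value; the
variable names are illustrative only.

```python
# Column totals and -aud values copied from Table 2 (occurrences per million).
TOTAL_M_PLUS, TOTAL_M_MINUS = 2229.60, 407.50
AUD_M_PLUS, AUD_M_MINUS = 2050.73, 67.31

# Cumulative frequencies once the -aud row is removed.
m_plus_without_aud = TOTAL_M_PLUS - AUD_M_PLUS
m_minus_without_aud = TOTAL_M_MINUS - AUD_M_MINUS
ratio = m_minus_without_aud / m_plus_without_aud

print(f"M+ without -aud: {m_plus_without_aud:.0f} occ./million")
print(f"M- without -aud: {m_minus_without_aud:.0f} occ./million")
print(f"M-/M+ ratio: {ratio:.2f}")
```

The two differences round to the 179 and 340 occ./million cited in the text, and the ratio
falls just below 2.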

To sum up, the reported study of the 9 possible transcriptions of [o] according to
the size and the cumulative frequency reveals that the probability for this phonological
ending to correspond to a suffix is low. More importantly, the cumulative frequency of
suffixed words bearing a semantically transparent construction is weak relative to the
non-suffixed words. Consequently, a decomposition hypothesis according to which any
item bearing a structured morphological surface is first decomposed into morphemic
constituents would lead to numerous useless prelexical mechanisms.


6 Something is rotten in the state of the decomposition hypothesis


In the present paper, I reviewed results from masked morphological priming reported in
the literature and highlighted the shortcuts taken by decompositionalists in interpreting
some data, in particular those related to formal effects, forgetting the semantic and the
paradigmatic aspects of morphology. Although I do not deny that morphology plays
a role during lexical access, I doubt that fast morphological effects can operate under
masked priming conditions (i.e., within a window of 50-60 ms). In addition, I propose
an alternative interpretation of its role within the mental lexicon.

Recently, Giraudo & Voga (2014) proposed a revised version of the supralexical model. 
This new model is sensitive to both lexical (e.g., frequency) and exo-lexical characteristics
of the stimuli (e.g., family size) and capable of coping with various effects induced by true
morphological relatives (e.g., singer-sing) and pseudo-relatives (e.g., corner-corn).
According to the model, morphological relationships are coded along two different
dimensions: syntagmatic and paradigmatic. The first level captures the perceptive regu- 
larity and the salience of morphemes within the language. It contains stems and affixes 
that have been extracted during word acquisition. Accordingly, during language acqui- 
sition, the most salient perceptive units (i.e., recurrent and regular) will be caught and 
coded by the cognitive system as lexical entries. At this very early level of processing, 
morphologically complex words, pseudo-derived words and nonwords whose surface 
structure can be divided into (at least two) distinct morphemes are equally processed. 
As a consequence, this level cannot properly be considered a morphological level; it is 
rather a level containing morcemes (from French morceau ‘piece’). Morcemes are word pieces 
that serve as access units and speed up word identification each time an input stimulus 
activates one of them. There is therefore no need to assume a process of morphological 
decomposition at this stage. 

Contrary to the first level, the second level deals with the internal structure of words, 
their formation according to morphological rules. This level contains base lexemes, units 
abstract enough to tolerate orthographic and phonological variations produced by the 
processes of derivation and inflection. Base lexeme representations are connected to mor- 
phologically related word representations and these connections are determined by the 
degree of semantic transparency between wordforms and base lexemes. Semantically 
transparent morphologically complex words are connected both with their morphemes 
and their base lexeme. Words with a semantically opaque structure, such as fauvette 
‘warbler’ (no longer related to its free-standing stem fauve ‘tawny’), or with an illusory 
structure, such as baguette ‘stick’, in which bagu- is not a stem and has nothing to do 
with bague ‘ring’, are not connected with a base lexeme. These two types of items are 
connected only with their surface morphemes, situated at the morceme level. Indeed, the 
model makes the fundamental assumption that base lexeme representations are created in 
long-term memory according to a rule that posits family clustering as an organisational 
principle of the mental lexicon. This rule stipulates that as soon as two 
words share form and meaning, a common abstract representation emerges; all the in- 
coming forms respecting this principle then feed this representation. In the course of 
language acquisition and learning, family size grows and links are continually being 
strengthened. 

Finally, returning to priming effects, the model postulates that they depend on the kind 
of relationship the prime entertains with the target (formal and/or semantic) and, 
consequently, on the number of excitation sources that target recognition triggers: 
a) when the prime is complex and semantically transparent, M+O+S+ (as in the pairs 
banker-bank or hatched-hat), its perception gives rise to three sources of excitation, 
from morcemes, wordforms and base lexemes; b) when the prime is semantically transparent 
and complex but not decomposable, M+O-S+ (as in the prime-target pair fell-fall), it 
activates two sources of excitation, from wordforms and base lexemes; c) when the prime 
is semantically opaque, M+O+S- (complex or pseudo-complex words, as in apartment-apart or 
corner-corn), its recognition triggers two sources of excitation, from morcemes and 
wordforms; d) when the prime is neither complex nor decomposable, M-O-S- (as in 
freeze-free), it gives rise to only one source of excitation, from wordforms. 
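
The four cases a) to d) can be restated as a small rule. The following Python sketch is 
only an illustration: the function name excitation_sources and its three boolean 
parameters are hypothetical labels, and the rule deriving the sources from them is one 
possible reading of the four cases, not a component of the model as published. 

```python
def excitation_sources(is_complex: bool, decomposable: bool, transparent: bool) -> list[str]:
    """Return the excitation sources a prime contributes (illustrative reading only)."""
    sources = []
    # Assumption: morcemes respond to any surface form that splits into known pieces.
    if decomposable:
        sources.append("morcemes")
    # Every prime is a word, so its wordform is always activated.
    sources.append("wordforms")
    # Assumption: base lexemes are reached only by complex, semantically transparent primes.
    if is_complex and transparent:
        sources.append("base lexemes")
    return sources


# The four prime types discussed in the text, keyed by an example pair.
CASES = {
    "banker-bank": (True, True, True),      # case a)
    "fell-fall": (True, False, True),       # case b)
    "corner-corn": (True, True, False),     # case c)
    "freeze-free": (False, False, False),   # case d)
}

for pair, features in CASES.items():
    print(pair, excitation_sources(*features))
```

Under this reading, the number of sources falls from three in case a) to one in case d), 
in line with the description above. 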

In our view, much work still needs to be done on morphological processing, but within 
the framework of a lexical network that codes word representations as the result of both 
syntagmatic and paradigmatic influences. Separating form from meaning, and words from 
their families and series, within experimental paradigms like masked priming, which 
focuses readers' attention exclusively on visual formal aspects, leads to a confirmation 
bias and reduces the notion of morphology to form alone. It is indeed very important to 
consider that masked priming effects do not correspond only to the early processes of 
lexical access, as suggested by numerous authors, but to a snapshot of lexical access 
taken at a given time within an ocean of complex relationships. 

Do derivational affixes have allomorphs? 
Towards a model of exponent variation in 
a constraint-based morphology 


Fabio Montermini 
CLLE-ERSS, CNRS & Université de Toulouse 2 Jean Jaurès 


This article deals with phenomena of formal variation in derivation (a gap between the 
expected form and the form actually observed for a derived lexeme) that cannot be treated 
in terms of stem variation, which suggests that the exponents of morphological 
constructions may themselves be subject to variation. To model this variation of 
exponents, I propose to extend the notion of constraint so that it covers not only 
properties specific to a given language but also properties specific to a given 
construction. The exponents of morphological constructions are then themselves viewed as 
(sets of) constraints that interact with the other constraints at play in the formation 
of complex lexemes. Each “allomorph” of an exponent is thus represented as a constraint 
which, as such, can be ranked with respect to the others; this accounts for the 
observation that some of these variants act as defaults, whereas others emerge only under 
particular conditions. To illustrate the model, I present two case studies of French 
morphological constructions of recent origin or development: the creation of names of 
speakers in -phone from the name of a language, and the creation of lexemes in -issimo 
with a broadly appreciative / superlative meaning. Each of the two constructions is in 
turn compared with related constructions: derivation in -phone is compared with the 
corresponding, cognate derivation of lexemes in -fono in Italian; derivation in -issimo 
is compared with the more canonical derivation of superlatives in -issime in French. 
These comparisons bring out the fact that formally and semantically similar constructions 
with the same origin may, in different languages or in the same language at different 
periods and for different purposes, develop different phonological specifications, which 
translates, in the framework adopted here, into sets of constraints that are different 
and/or differently arranged. 


Fabio Montermini. Les affixes dérivationnels ont-ils des allomorphes ? Pour une 
modélisation de la variation des exposants dans une morphologie à contraintes. In Olivier 
Bonami, Gilles Boyé, Georgette Dal, Hélène Giraudo & Fiammetta Namer (eds.), The 
lexeme in descriptive and theoretical morphology, 423-465. Berlin: Language Science 

1 Introduction 


One of the major changes that the study of morphology has undergone in recent decades has 
been the shift from morphemic, decompositional and combinatorial models towards models 
more concerned with describing the relations that hold between more or less complex 
words. One consequence of this change is that these relations are no longer analysed in 
terms of rules that are directional, deterministic and independent of the units that 
instantiate them, but by means of more flexible concepts such as “pattern” or “schema”, 
which capture the way speakers draw generalizations from the existing lexicon. This is 
what we find, for instance, in Construction Morphology, developed mainly by Booij (2010), 
but also in the constraint-based model developed by Hathout (2009) and above all in the 
recent work of Marc Plénat and Michel Roché (Plénat & Roché 2014, Roché & Plénat 2014, 
2016). All these approaches are output-oriented, in the sense that they are less 
interested in describing the set of procedures that lead from an input to an output (a 
(more) complex lexeme) than in accounting for the constraints that bear on the form (and 
meaning) of a constructed lexeme or, more precisely, of all the constructed lexemes that 
belong to the same series (that is, that are built by the same morphological operation). 
Among other results, these approaches have made it possible to account efficiently for 
the allomorphic variation observed in the constructed lexicon, in particular as regards 
the selection of the stem of the base lexeme and the modifications it may undergo. By 
contrast, with a few exceptions (notably Lignon & Roché 2011), variation in the form of 
exponents (traditionally called affixal allomorphy) has been little discussed in this 
framework. One of the main reasons is certainly that the approaches in question have most 
often chosen to maximize the complexity of lexical representations while simplifying, in 
parallel, the phonological instruction associated with morphological operations, and thus 
to push allomorphy, as far as possible, onto the side of stems rather than affixes 
(Bonami et al. 2009, for instance, are very clear on this point). Yet the idea that 
allomorphy can affect the affixes of constructed words as well as their stems often seems 
to be taken for granted in lexicography, in several phonological frameworks (e.g. 
Optimality Theory), and also in morphology, whether in traditional morphemic approaches 
(which is to be expected, since in those frameworks stems and affixes are objects of the 
same nature) or in so-called “classical” lexeme-based morphology. In this context, an 
emblematic position seems to me to be that of Scalise (1999) who, discussing Italian 
deverbal nouns, asks “in amministrazione il suffisso sarà -azione, -zione o -ione?” (‘in 
amministrazione, is the suffix -azione, -zione or -ione?’), thereby suggesting both that 
it is possible (and interesting) to identify a precise form for the suffix in the 
derivative in question, and hence to draw a sharp boundary between suffix and stem, and 
that the suffix may potentially appear in different forms. 

In this article I will argue, on the contrary, that a question like the one above is not 
a relevant one and that, in an output-oriented, constraint-based morphological framework, 
the formal sequence corresponding to the exponent of a morphological operation results 
solely from the application of a constraint which, as such, interacts and may compete 
with the other constraints bearing on the form of a constructed word. If the exponent of 
a morphological operation itself corresponds to a constraint, there is no longer any 
theoretical need for it to have a definite and constant form across all the derivatives 
in which it appears, including in the default case. On the contrary, the existence of 
several “allomorphs”, for instance of one and the same affix, is expected, and these can 
be ranked, since each of them allows a certain number of formal constraints, themselves 
potentially in competition, to be satisfied. More generally, I adopt a framework and an 
inventory of constraints which, with few modifications, are those proposed by Plénat & 
Roché (2014) and Roché & Plénat (2014, 2016). It should be noted that the framework I 
adopt, and the model I propose for the variation of the exponents of morphological 
operations, are particularly well suited to an exemplar-based model of morphology.¹ 
Constraints are thus merely a means of modelling the preferences that speakers display in 
their activity of morphological creation; from this point of view, incorporating into 
constraints purely declarative properties such as the form of an affix is perfectly 
legitimate and, I believe, in line with the research cited, since this property is 
crucially among those that speakers identify in existing complex words and wish to 
reproduce in the ones they build. 

The model I propose represents the current state of a line of thinking about the form of 
complex words that I have been pursuing for several years and have already set out in 
earlier publications. Looking back, one of the first readings that prompted me to reflect 
on this topic was Fradin's (2000) article on blends and what he called “related 
phenomena”.² That article, which offers an analysis and classification of a broad range 
of morphological constructions that depart from canonical affixation, contains, among 
other things, data like those in (1),³ which, modelled on pérestroika, denote 
politico-economic reforms that took place in France, Cuba and South Africa respectively, 
as well as a renewal of sexual mores in the former USSR: 


(1) a. Béréstroika < (Pierre) Bérégovoy 
    b. Castroika < (Fidel) Castro 
    c. Prétoriastroika < Prétoria 
    d. Sextroika 


¹ By “exemplar-based”, I mean a model of grammar in which patterns (here, morphological 
ones) emerge in speakers' competence from the existing lexemes to which they are exposed 
(cf. Bybee 2006, 2013; Blevins & Blevins 2009 for recent overviews). 

² An article I read before it appeared, since I cited it, as “forthcoming”, in my 1998 
DEA dissertation. 

³ The same data are taken up in Fradin (2003: 212-213). 

Data like these are clearly problematic for any model that would try to apply a process 
of morpheme combination mechanically. One of the forms, Prétoriastroika, clearly results 
from the concatenation of two elements, but the others display varying degrees of fusion 
between the elements involved. Moreover, there appears to be a phonological sequence 
([stʁɔjka]) which, in French, is obligatorily present in these complex words, and from 
this point of view it can rightly be regarded as the “exponent” of the morphological 
construction. However, the constructed lexeme may retain a larger portion of the 
phonological material of the model word (as in Béréstroika), and the base may be 
preserved in full or undergo various kinds of readjustment. Some of the words in (1), 
notably Béréstroika and Castroika, could also be analysed as blends, since the sharing of 
phonological material is often regarded as an essential feature of this type of formation 
(Fradin 2000: 28-31). In that article, however, Fradin shows convincingly, on semantic 
grounds, that the forms in (1) are indeed cases of affixation (“secretive” affixation, 
since the affix arises from the reduction of a lexeme). To Fradin's semantic argument one 
may add the fact that, unlike blends, these words form a series, which would certainly 
have been larger had the vicissitudes of history not deprived perestroika of much of its 
political and media impact, and thereby crucially reduced the salience of the word in 
speakers' linguistic awareness. A notion such as that of derivational series, now 
regarded as a fundamental element of the morphological organization of the lexicon, was 
not among the theoretical tools available at the end of the 1990s. If the words in (1) 
are indeed the result of an affixation process, a relatively simple way of representing 
the exponent of this morphological construction is to posit a constraint requiring the 
derivative to end in the phonological sequence [stʁɔjka], which may simply be 
agglutinated to a base (Prétoriastroika) but may also share segments with it (Castroika). 
Besides proposing a classification of non-canonical morphological processes grounded in a 
very fine-grained analysis of the formal and semantic properties of the elements in 
question and in solid criteria, that article, in my view, played an important role on 
another level, namely in identifying “minor”, marginal formations, seemingly foreign to 
the “core” of the language, as legitimate objects not only for lexicology or 
lexicography but also for a formal approach to language, and to morphology in particular. 
In the years that followed, taking all types of data into account, in particular data 
created spontaneously by speakers in uncontrolled situations, became established 
practice, and the theoretical interest of such data for the study of morphology, 
especially derivational morphology, is now accepted. This development went hand in hand 
with the expansion and spread of linguistic resources, and hence with the gradual 
enlargement of the lexical databases available.⁴ In this context, and at a time when 
morphologists' data were still mostly drawn from “traditional” sources, Bernard Fradin 
(with others) was one of the first to see the importance of “marginal” data and to 
exploit them to feed theoretical reflection. The present article belongs to the same 
movement of extensive, usage-based morphology. In particular, to justify the model of 
affixal allomorphy that I propose, I will rely on two case studies of French 
morphological processes of recent origin or development, for which speakers have neither 
metalinguistic guidance (more or less consciously internalized) about how they work, nor 
a large number of lexemes that belong to the established lexicon and can serve as models 
in the creation of new words. As we shall see, these are processes that are still partly 
taking shape, and for which speakers' choices are not always uniform, since speakers may 
rely, in lexical creation, on several cues, giving a different weight to each. The first 
phenomenon I will examine is the construction of nouns (or adjectives) that denote the 
speakers of a language and are built with the element -phone (francophone, 
occitanophone, quechuaphone / quechuophone, wolophone), which I compare with the 
corresponding nouns in Italian (francofono, occitanofono, quechuofono, wolofono) 
(Section 3). The second is the construction of nouns or adjectives (often, but not 
exclusively, brand names) with the suffix -(i)ssimo (Colissimo, Doctissimo, Tassimo, 
Vernissimo), which I compare with the adjectives (and nouns) built with the more 
established suffix -issime (Section 4). Before these empirical studies, however, I offer 
some observations on how the variation of the exponents of morphological constructions 
can be handled in a constraint-based model, and I show that this parameter is not 
different in substance from the other formal constraints bearing on the form of 
constructed lexemes (Section 2). 

⁴ A list of the studies which, especially in France, have adopted this extensive approach 
to morphology, and of the theoretical advances it has made possible, would certainly go 
beyond the aims of this article. I therefore confine myself to citing a few studies that 
offer instead a metatheoretical reflection on the ongoing process and its consequences, 
for instance Hathout et al. (2008); Hathout et al. (2009); Dal & Namer (2012, 2016). 
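
The idea introduced above for the forms in (1), that the exponent is a constraint on how 
the derivative ends rather than a string to be concatenated, can be sketched in a few 
lines of Python. This is a minimal illustration under stated assumptions: the function 
names realize and satisfies_exponent are hypothetical, spelling is used as a crude 
stand-in for phonological form, and the base is always kept whole, so forms such as 
Béréstroika or Sextroika, which involve truncation or sharing below the level of the 
letter, are outside its reach. 

```python
def satisfies_exponent(form: str, exponent: str) -> bool:
    """Exponent constraint: the derivative must end in the exponent sequence."""
    return form.endswith(exponent)


def realize(base: str, exponent: str) -> str:
    """Shortest form that keeps the whole base and satisfies the exponent constraint.

    Assumption: shared material is the longest string that is both the end
    of the base and the beginning of the exponent (orthographic proxy).
    """
    # Try the longest possible overlap first, then shorter ones.
    for size in range(min(len(base), len(exponent)), 0, -1):
        if base.endswith(exponent[:size]):
            return base + exponent[size:]
    # No shared segments: plain agglutination.
    return base + exponent


EXPONENT = "stroika"  # orthographic stand-in for the phonological sequence
for base in ("Prétoria", "Castro"):
    form = realize(base, EXPONENT)
    print(base, "->", form, satisfies_exponent(form, EXPONENT))
```

Both outputs satisfy the same constraint; whether the base and the exponent share 
segments follows from the shape of the base, not from two different affix forms. 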


2 Exponent variation in a constraint-based 
morphological model 


For many linguists, whether working in formal or more descriptive frameworks, there is no 
doubt that the exponents of morphological operations can be subject to formal variation 
(or, to put it more simply, that affixal allomorphy exists). This is even expected in 
models that draw no distinction in kind between lexical units and sublexical units 
(affixes), other than in their combinatorial properties and their syntactic autonomy. By 
way of example, the headwords of the entries that the TLFi devotes to the suffixes that 
build aimable and amabilité take the forms “-able, -ible, -uble” and “-té, -eté, -ité” 
respectively. Likewise, in the book that helped establish the lexicalist approach to 
morphology, Aronoff (1976: 100), while acknowledging that affixes have no autonomous 
existence outside the word formation rules that introduce them, considers that the suffix 
that builds action nouns in English “has at least four, and possibly five, forms”: 
+Ation, +ition, +ution, +ion, +tion.⁵ In such cases, it is implicitly assumed that an 
affix, whether or not it exists independently of the rule that introduces it, must be 
representable in a discrete form, and that it is therefore always possible to draw a 
boundary between it and the stem of the base lexeme, which may or may not itself display 
an allomorphic form. The phonological variation observed (which, it may be noted in 
passing, always concerns the part supposed to be in contact with the base) parallels the 
allomorphic variation observed for lexemes, and can be handled by appeal to the same 
morphophonological conditioning. A recent development in lexeme-based morphology has 
consisted in viewing lexemes more and more as multiform units that are nonetheless 
internally structured, including formally, an approach informally called “stem-based 
morphology” (morphologie thématique; e.g. by Plénat 2008a, referring to earlier work such 
as Bonami & Boyé 2003). In this framework, the synchronically irreducible allomorphy 
observed for some lexemes is accepted as an intrinsic property of those lexemes, 
explicitly encoded in their lexical representation. The counterpart of this increase in 
the amount of information memorized by speakers is a strong simplification of 
morphological procedures. In other words, most of the variation observed, and hence most 
of the complexity, is shifted onto the bases (stems or roots), with a simplification of 
morphological operations (inflectional or derivational) and consequently of their 
exponents, which are, as far as possible, regarded as unique. The article by Bonami et 
al. (2009) is one of the clearest and most convincing illustrations of this approach. In 
the proposal of Bonami and colleagues, the suffix that builds deverbal action nouns in 
French has a constant form ([jɔ̃]), and the variation observed is to be attributed to the 
verbal stem selected by the lexeme formation rule, a stem that may be either identical to 
one of the verb's inflectional stems (dispersion) or autonomous (modification, 
réduction). As I noted in the introduction, most work in stem-based morphology has quite 
naturally focused on the formal variation of the bases of derivational processes, looking 
either at stem selection and the modifications the stem may undergo (Plénat 2008a, Roché 
2010, Roché & Plénat 2014, Hathout & Namer 2014) or at cases of competition between 
operations (Lignon & Plénat 2009, Lignon 2013, Koehl & Lignon 2014, Roché & Plénat 2016, 
among others). To my knowledge, one of the few studies in this framework to address the 
question of affixal allomorphy explicitly is Lignon & Roché (2011), who, in the 
construction of relational adjectives in French, identify -éen and -ien as “two variants 
of one and the same suffix -ien” (Lignon & Roché 2011: 191). Other cases, including cases 
traditionally identified as affixal allomorphy, are by contrast treated less clearly and 
consistently. By way of example, I present two cases taken from the recent literature on 
French: that of presuffixal glides in certain derivatives (in particular those in 
-eux),⁶ and that of the suffix that builds quality nouns such as rareté or amabilité. 
Forms such as ambitieux, injurieux or luxueux, which contain a glide ([j] or [w]) at the 
junction between base and affix, are often regarded as containing an allomorphic form of 
the suffix, whose distribution may be determined by phonological and/or morphological 
constraints. The TLFi, for example, lists -ieux and -ueux as variants of the suffix -eux. 
More recent treatments, however, tend to treat the series of lexemes ending in -ieux / 
-ueux either as cases of stem allomorphy (this seems to be the position expressed by 
Bonami et al. 2009: 104-105) or, quite simply, as subseries of the lexemes in -eux which, 
because they contain many lexemes (a large number of them inherited directly from Latin) 
and are uniform, tend to grow further still (cf. Roché 2011: 86; Roché & Plénat 2014). In 
that case, identifying the glide as belonging to an allomorph of the base stem or to a 
variant of the suffix loses much of its interest, since “the various processes that tend 
to enrich the rhyme merge and interpenetrate” (Roché & Plénat 2014: 1867). The situation 
is even less clear for deadjectival quality nouns ending in [te]. Plénat and Roché seem 
to regard -ité and -(e)té sometimes as two variants of the same suffix (Roché 2011: 80; 
Roché & Plénat 2012: 1395), sometimes as two suffixes that are related (if only 
diachronically) but distinct (Plénat 2008a: 1617; Roché & Plénat 2014: 1865, 1869), 
whereas Koehl (2012: 173) states explicitly that “-ité and -té are two allomorphic 
variants of one and the same suffix, noted -Ité”. These two examples, anecdotal in 
themselves but nonetheless significant, show, in my view, that the path taken by 
stem-based morphology, namely asking questions other than “where is the boundary between 
stem and affix in the constructed lexeme X?”, is the right one, but that it has not 
entirely rid itself of certain reflexes typical of classical combinatorial morphology 
(for example, identifying a discrete and if possible unique form for an affix). In what 
follows, I would like to help push morphology further along the path just mentioned, 
developing three points in particular: (i) not all the formal variation observed in 
derivation can be attributed solely to the stem variation of bases; there are cases where 
the variation clearly cannot be attributed to the selection of a particular stem, but 
pertains to the exponent; (ii) it is necessary to distinguish cases in which a set of 
lexemes results from the same construction, which displays variation in its exponent, 
from cases in which we are dealing with several sets of lexemes resulting from different 
constructions with different exponents (which may happen to be formally and/or 
semantically similar); (iii) when we are dealing with a set of lexemes resulting from the 
same construction that displays variation in its exponent, this variation can be 
described in the form of ranked constraints of the same type as the other constraints 
bearing on the form of constructed words. Section 2.1 is devoted to the first two points, 
Section 2.2 to the third. 

⁵ “+” is the symbol Aronoff uses to indicate a type of morphological boundary. 

⁶ The label “presuffixal glide” (semi-voyelle présuffixale) is inspired by Thornton 
(1999), who devoted an article to the same phenomenon in Italian. 


2.1 Formal variation of exponents 


As I have noted, stem-based morphology has adopted, as a general principle, the idea that 
the formal variation found in complex words is more advantageously treated in terms of 
stem suppletion than of exponent variation. The appeal of this move is easy to 
understand, especially when one considers that the model was first designed to deal with 
inflectional phenomena (mainly in the Romance languages): the hypothesis of stem 
allomorphy is all the easier to maintain as inflected forms display little variation in 
their exponents (endings), and in most cases the allomorphies involved can be reduced to 
variation in inflectional class. By contrast, there are a number of phenomena of stem 
variation that can only be treated, synchronically, in terms of suppletion.⁷ If positing 
stem suppletion, at least to some degree, is therefore necessary, it is more economical 
to lighten the rule apparatus by associating, as far as possible, a single formal 
instruction with each morphological construction.⁸ This model, however, convincing as it 
is in many cases, cannot account for all the variation observed. The uncertainty I 
reported above concerning the suffixes (to put it briefly) -eux and -ité seems to me 
emblematic of this fact. There are indeed many cases of derivation for which the 
hypothesis of exponent variation is far more convincing than that of stem suppletion. 
Lignon & Roché (2011), for example, devote several pages to a very solid demonstration 
that -ien, -éen and -ain (and even -en) are all “allomorphs” of a single morphological 
construction exponent, which they transcribe -IEN. An explanation in terms of exponent 
variation should also be invoked, it seems to me, for the cases of substitution of -este 
for -esque (grandiloqueste, titaniqueste) studied by Plénat, Tanguy et al. (2002). The 
fact that in this last case the two variants have different origins (the Latin suffix 
-iscus via Italian in one case, and the suffix -estis in the other) matters little 
synchronically if the two variants are used in complementary distribution depending on 
the phonological form of the base, as Plénat and colleagues show. Cases in which we are 
very probably dealing with variation of the exponent rather than of the base stem are 
also very numerous in prefixation, in French and in other languages. This is the case, 
for example, of the three variants of the negative prefix spelled in- (or il-, im-, ir-), 
which appears in the forms [in], [i] and [ɛ̃], which are, at least partially, in 
complementary distribution (cf. Apothéloz 2003); it is also the case of prefixes such as 
sous-, for which there is a variant containing a “liaison” consonant 
(sous-alimentation, sous-entendre). In all these cases, conceiving of the observed 
variation as stem suppletion would seem unnatural, or even impossible in some cases such 
as in-. Admittedly, one could maintain, as has often been claimed, that prefixation and 
suffixation differ in nature, and that the former involves units that display greater 
autonomy, and hence more variability. However, there are very good arguments for 
rejecting the idea that there is a substantial difference between these two derivational 
processes, and the framework I adopt is precisely one in which all constructional 
morphological processes correspond to operations of the same nature, with at most a 
continuum determined by the greater or lesser autonomy of the elements involved (cf. 
Lasserre & Montermini 2014). 

⁷ For a critical examination of stem-based morphology applied to inflection that reaches 
conclusions broadly similar to those defended here, cf. Bonami (2014: 34-84); Bonami & 
Boyé (2014: 18-22). 

⁸ Naturally, this reasoning excludes cases in which a derivational exponent has different 
forms in different instances of the same lexeme (that is, builds several stems at once), 
such as [jɛ̃], [jɛn], [jan] in italien, italienne, italianiser. 

The examples mentioned above clearly show that, in a relation of constructional morphology, variation (in more traditional terms, allomorphy) may affect either the stems of the base lexeme or the exponent (possibly both at once), and that in some cases we are indeed facing examples of "affixal allomorphy". If so, the first problem that arises is that of determining, when we observe formal variation in a set of similar derivatives, whether we are dealing with allomorphy of the exponent or with two or more different constructions whose exponents display formal and/or semantic similarities. The task is certainly complicated by the fact that cases of "affix swapping" (échangisme affixal), in which speakers choose, for a given base, an affix that is equivalent to, or even semantically less suitable than, the expected one because it appears preferable from a formal point of view (cf. among others Lignon & Plénat 2009, Lignon 2013, Roché 2013), are attested and frequent. It seems to me that at least two factors can be invoked to identify a variation as affixal allomorphy. First, the different variants must be phonologically similar enough to be identified by speakers as belonging to the same constructional exponent, for example by displaying alternations that are phonologically natural and/or observed elsewhere in the language. This is the case, for example, of the "floating" segments observed in the different variants of -IEN (but also before -eux), of assimilation in in-, or of the emergence of a "latent" consonant in sous-. Naturally, this formal homogeneity must hold for all the forms of the same exponent that appear in the stems it serves to build. It is this last criterion, for example, that makes it possible to group -ien, -éen and -ain together as variants of the exponent of a single construction, but to set apart the -in that also builds demonyms (alpin, girondin), since the lexemes it derives have the same ending as the suffixes above in stem A (that of the masculine forms), but not in stem B (that of the feminine forms).⁹ Second, the context in which the different variants appear must be clearly identifiable from a phonological or morphological point of view. In the best case, the different variants are in perfect complementary distribution; in practice, however, one is more likely to observe preferences for one variant or another according to the phonological form of the base. All the studies mentioned above (Lignon & Roché 2011 on -IEN, Plénat, Lignon et al. 2002 on -esque, Apothéloz 2003 on in-) show first and foremost that the choice of one variant or the other is never deterministic, and that variation is the normal condition of existence of all these constructions. By contrast, common origin or other extralinguistic properties are obviously not good criteria for deciding whether two variants belong to two different constructions or to the same one. Plénat (2008b) and Roché & Plénat (2016) have shown, for example, that the distribution of -ais or -ois as a suffix for building demonyms (which one might be tempted to regard as two allomorphs of a single suffix, since they derive from the same Latin suffix and, in parallel fashion, build a stem B in [z]) rests, at least in part, on geographical and historical criteria, which leads one to regard them as the exponents of two distinct morphological constructions, although they are obviously related from a semantic and functional point of view.

9 For the labelling of stems, I use the same conventions as Plénat (2008a) or Roché (2010).

Once we have established that not all the variation observed in derivation can be attributed to stem suppletion alone, and that a certain number of phenomena can only be analysed in terms of variation of the exponents, it remains to establish how to model this variation of the exponents within a stem-based morphology framework, and how it interacts with the mechanisms of stem selection.


2.2 Morphological exponents as constraints


A simple and, in my view, effective way of representing the variation of exponents in a framework such as the one adopted in this work is to treat the exponents themselves as constraints. In other words, the exponent of a morphological construction can be viewed as a set of formal constraints on the form of its outputs. More precisely, I consider that each morphological construction specifies a set of formal properties, prosodic or segmental, that its derivatives must have. These are therefore construction-specific constraints whose satisfaction is of course conditional on the satisfaction of other constraints, universal or language-specific. As in the classical models that use this tool, constraints may conflict with one another (in which case they are ranked, in a stable or variable way) or, on the contrary, converge and thus reinforce one another (Plénat & Roché 2014: 51, who draw on Burzio 2002). The existence of prosodic constraints (for example concerning the optimal size of a constructed word) has long been observed and discussed, in particular for French (cf. Plénat 2009 for an overview). More recently, the segmental structure of derived lexemes, especially in cases where there is a gap between the expected form and the attested form, has also been described in terms of constraints. In particular, Roché & Plénat (2014: 1868) identify two constraints, which they call the "Family constraint" (Contrainte de famille) and the "Series constraint" (Contrainte de série) respectively, whose overall purpose is to make a derived lexeme as similar as possible to other related lexemes, either because they belong to the same family (and are thus built on the same base lexeme) or because they belong to the same series (and are thus built by the same morphological process). The series constraint, in particular, accounts for the fact that the same suffix tends to select base stems that are as similar as possible segmentally. This explains, among other things, the emergence of homogeneous sub-series within the same derivational series. Cases of suffix co-occurrence, or the frequency of certain sequences before an affix (among others, -titude, -inette, -alisme, -anisme, -ariat, -orat, -inat, etc.; cf. Plénat & Roché 2014 for an overview), have been analysed in terms of the series constraint. Overall, then, the series constraint ensures that all the words derived by the same construction (which belong to the same series) are as similar as possible in their right-hand part (in the case of suffixation). In the model developed by Plénat and Roché, this may correspond to at least two types of operation, which can in turn be divided into subgroups:


i) the selection of a stem, which may be:

a) a stem of the base lexeme that also appears in other derivatives, for example the stem that appears in snobinard for snobinat, built on snob (Plénat & Roché 2014; this operation satisfies the series constraint and the family constraint simultaneously);

b) the stem of another lexeme belonging to the same morphological family, for example personnal- (learned stem of personnel) in personnalisme, which, semantically, is built on personne (Roché 2009: 159);

ii) the creation ex novo of a radical¹⁰ from a stem of the base lexeme, which may proceed:

a) by truncation, for example in végétariat built on végétarien (Plénat & Roché 2014: 67). This operation also makes it possible to satisfy prosodic constraints on the size of derivatives;

b) by addition of a sequence, for example in geekariat built on geek (Plénat & Roché 2014: 69);

c) by manipulation of the stem, for example in the derivatives of gouverneur, gouvernorat, gouvernatorat, gubernatorat, etc. (Plénat & Roché 2014: 59), which reconstruct learned stems for a lexeme that, in French, normally lacks them.


10 On the distinction between "stem" (thème) and "radical" (radical), cf. in particular Roché (2010).

All the operations described above aim to include the constructed lexemes in what Plénat and Roché call "lexical sub-series", that is, sets of lexemes derived by the same morphological construction which, segmentally, share more than the exponent of the construction in question, here [ina] or [ɔʁa] for suffixation in -at, and [alism] for suffixation in -isme. The larger a sub-series is, the more it acts as a pole of attraction for new lexemes, even at the cost of inducing the selection of a stem that is not semantically optimal (as in personnalisme), or a manipulation of the stem (as in the cases in (ii) above), entailing, in both cases, a violation of the base-derivative faithfulness constraint. Several cases of affix combinations in French, more or less justified semantically, have been treated from the perspective of including the lexemes involved in morphological sub-series (cf. Roché 2009, 2011, Namer 2013, Lignon et al. 2014). In other cases, however, the segments that make it possible to identify a sub-series do not necessarily correspond (or at least no longer correspond in synchrony) to an affix; this is the case of the sub-series -inat for -at (cf. above), but also of the sub-series -titude for -itude (Plénat & Roché 2014: 53), -acisme for -isme (Roché 2011: 85), etc. All these sequences (whether or not they derive from synchronically analysable suffixes) have a purely formal and lexical function, since they reduce dispersion within morphological series and thus help to make them more homogeneous. On closer inspection, from this point of view there is no substantive distinction between these sequences and the sequences we traditionally accept as affixes. In a theoretical framework that does not grant affixes any autonomous existence outside the morphological operations of which they are the exponents, affixes can be conceived simply as arbitrary associations of sequences of segments with a construction. Their only role is to make it possible to recognize that a lexeme has been derived by a given construction, and hence to yield constructions that are as formally homogeneous as possible. As I mentioned above, I therefore propose to conceive of all the formal sequences that make it possible to identify morphological series or sub-series as constraints, deriving in particular from a broadening of the series constraint, for which I propose the following formulation:


(2) Series constraint: all lexemes belonging to the same morphological series
are identical.


The formulation above is deliberately vague, and can cover both the formal and the semantic properties of derived lexemes (whatever semantic model one refers to). Although it may seem paradoxical, it is in my view sufficient to account for all the properties of constructed lexemes belonging to the same series. On the one hand, the series constraint is counteracted by other constraints, first and foremost by the family constraint,¹¹ which relates each lexeme to the lexemes built on the same base and, in effect, prevents the series constraint from making all the lexemes of the same series identical. On the other hand, in practice all the members of the same morphological series share formal elements that are common and always occupy the same position, which gives rise to the identification of exponents that, at least in French, are generally prefixes or suffixes. It is possible, moreover, that in some cases it is useful to regard the series constraint, in the formulation I have given, as weightable according to the frequency and salience of the lexemes in a given series. Since the lexemes of the same series never share all their segments, one can imagine that new lexemes entering a series tend to align formally with its most frequent or salient lexemes. In extreme cases, where a series contains a lexeme that, for various reasons, plays the role of a prototype lexeme (a "leader word", in the terms of Rainer 2003 or Roché 2011), this lexeme constitutes the model that the other lexemes tend to resemble, including formally. This is the case of the lexemes belonging to the series given in (1) in the introduction, in which pérestroika is by far the most salient lexeme, since it is at the origin of the series. In this case, the form of the new lexemes included in the series (few in number, in the end) is evaluated, with respect to the series constraint, mainly according to their similarity to this prototype lexeme, which explains why different lexemes (for example Béréstroika or Castroika) have been able to retain variable portions in their exponent.

11 In parallel, one could imagine a Family constraint stipulating that all lexemes of the same family are identical. Such constraints, opposed and of equal strength, would have the effect of cancelling each other out, in effect preventing all the lexemes of the same family or of the same series from being identical, while accounting for the fact that they share most of their properties.
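
As a rough illustration of how similarity to a leader word might be measured, the following Python sketch computes the final portion that a new lexeme shares with the prototype of its series. It is only an orthographic proxy under stated assumptions: the function name `shared_final_portion` is hypothetical, the forms are written in lower case without accents, and nothing here is claimed to be the evaluation procedure of the model itself.

```python
def shared_final_portion(word: str, leader: str) -> str:
    """Return the longest run of final characters that word shares with leader."""
    n = 0
    # Walk backwards from the end of both strings while the characters match.
    while n < min(len(word), len(leader)) and word[-1 - n] == leader[-1 - n]:
        n += 1
    return word[len(word) - n:]


# Assumption: accent-free, lower-case spellings of the forms cited in the text.
LEADER = "perestroika"
for new_lexeme in ("castroika", "berestroika"):
    print(new_lexeme, "shares", shared_final_portion(new_lexeme, LEADER))
```

The two derivatives keep portions of different length from the same leader word, which is the situation the series constraint, evaluated against a prototype, is meant to allow.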

Concretely, we can imagine that the constraint given in (2) breaks down into more specific constraints and sub-constraints which, for each construction, define the segments that the words of the corresponding series share, and their position. Plénat & Roché (2014: 54) themselves raise the idea that a morphological construction can be regarded "as a macro-constraint resulting from the presence in the lexicon of a series of words". To take up and develop the case they discuss, that of French nouns in -at, their formal representation can be seen as comprising the constraints [Xa], [Xaʁja], [Xika], [Xɔna], [Xɔʁa], etc. (cf. the list given by Plénat & Roché 2014: 54). The fact that the sub-constraints [Xaʁja], [Xika], [Xɔna], [Xɔʁa] are partially in contradiction with one another is obviously not problematic in a framework in which the simultaneous satisfaction of all constraints is not required. The same constraints can be regarded as standing in an "Elsewhere Condition" relation with the more general constraint: the latter corresponds to the default choice adopted when other constraints prevent the more specific sub-constraints from being satisfied. The idea that constraints of this type stand in such a hierarchical relation is crucial in this framework. In practice, it is indeed clear that, all else being equal, lexemes produced by the same construction tend always to display the same form of the exponent, which therefore corresponds to its default form. This default may, as in the general case discussed here, correspond to a form that is underspecified relative to the others ([Xa]), but it may also correspond to a form that has the same degree of specification as the others but is more frequent in the series in question. To explain forms in -at such as hôtessariat, shérifariat, victimariat, etc., Plénat & Roché (2014: 71) observe that "-ariat must have become, for some speakers, the default form of the suffix". The existence of a "default form" of morphological markers has been observed in several cases. Lignon & Roché (2011: 191), for example, give -ien, -éen, -ain and -en as possible forms of the suffix -IEN, with the first variant as the default form. In earlier work (Montermini 2010, 2015), I argued for a similar position for the cognate suffixes of Italian. Taking into account neological data such as those in (3), I argued that the exponent in question has an underspecified form [Vano], whose V position is filled by default by a segment [j] when the base is not problematic for Italian phonology (ending in a simple unstressed vowel or in a consonant: calcuttiano, hannoveriano), or by a vowel supplied by the base when the base has a problematic ending (stressed vowel, hiatus, diphthong); finally, the form [ano] not preceded by a vowel emerges in the great majority of cases with bases ending in an unstressed vowel [a] (wojtylano).

(3) a. calcuttiano ← Calcutta
b. hannoveriano ← Hannover
c. deandreano ← (Fabrizio) De André
d. murnauano ← (Friedrich) Murnau
e. pessoano ← (Fernando) Pessoa
f. wojtylano ← (Karol) Wojtyla
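
The distribution illustrated in (3) can be sketched, very approximately, as a small Python function. This is a minimal sketch under several assumptions that are not part of the analysis itself: it works on spelling rather than on phonological representations, the type of ending of each base is supplied by hand (stress cannot be read off the spelling), the labels and the function names `attach_ano` and `iano_forms` are hypothetical, and for bases in unstressed [a] it simply returns both variants, since both are attested.

```python
def attach_ano(stem: str) -> str:
    """Add -ano to a stem that keeps its final vowel.

    A stem-final a and the a of -ano are realized only once, which avoids
    a sequence of identical vowels (*wojtylaano).
    """
    return stem + ("no" if stem.endswith("a") else "ano")


def iano_forms(base: str, ending: str) -> list[str]:
    """Return candidate derivatives for a base, given a hand-assigned ending type."""
    if ending == "consonant":
        return [base + "iano"]                 # default [j] fills V
    if ending == "unstressed_vowel":
        return [base[:-1] + "iano"]            # default [j] fills V
    if ending == "unstressed_a":
        # Both the bare [ano] form and the default form are attested.
        return [attach_ano(base), base[:-1] + "iano"]
    if ending in ("stressed_vowel", "vowel_sequence"):
        return [attach_ano(base)]              # V is supplied by the base
    raise ValueError(f"unknown ending type: {ending}")


# Assumption: lower-case, accent-free spellings of the bases in (3).
for base, ending in [
    ("hannover", "consonant"),
    ("deandre", "stressed_vowel"),
    ("murnau", "vowel_sequence"),
    ("pessoa", "vowel_sequence"),
    ("wojtyla", "unstressed_a"),
    ("calcutta", "unstressed_a"),
]:
    print(base, "->", iano_forms(base, ending))
```

Returning a list rather than a single form is a deliberate choice: the text describes preferences, not a deterministic rule, so the sketch only enumerates the candidates that the constraints then have to adjudicate.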


The constraints corresponding to the different variants of an affix can thus themselves stand in hierarchical relations, with, generally, one form that has the status of default relative to the others. This hierarchical relation can take at least two forms: i) the default form is underspecified relative to the others ([Xa] vs. [Xaʁja], [Xika], [Xɔna], [Xɔʁa]); ii) the default form has the same degree of specification as the other forms but is more frequent in the corresponding series (-ien vs. -éen, -ain, -en), or is even more specified ([jano] vs. [Vano] in Italian). Naturally, the forms that do not correspond to the default may themselves stand in a hierarchical relation. Thus, in the case of French nouns in -at, according to Plénat & Roché (2014), [Xaʁja] seems to function as a secondary default, more frequent in the series, and hence more available, than the other variants.

As I observed above, the constraints corresponding to the phonological form of the exponents of morphological constructions (which, I recall, I regard as sub-constraints of a more general series constraint of the form given in (2)) naturally interact with the other formal constraints that bear on constructed words. For example, the constraints on the segmental structure of French constructed lexemes in -at, given above, are associated with a more general constraint of French requiring that a constructed lexeme preferably contain three syllables. Likewise, these segmental constraints interact with constraints generally regarded as universal, such as phonological markedness constraints or a base-derivative faithfulness constraint. Some of the lexemes in (3) exemplify this. A form such as wojtylano, for example, respects the base-derivative faithfulness constraint, as well as a general phonological constraint that disfavours sequences of identical vowels (which would be violated by *wojtylaano), but partially violates the segmental constraint [Vano]. The alternative form wojtyliano, which is also attested, by contrast respects this last constraint (and even the hierarchy that designates [jano] as the default form), but can be regarded as less optimal with respect to base-derivative faithfulness, since the final vowel of the base is deleted. For its part, deandreano respects the base-derivative faithfulness constraint (all the segments of the base are found in it) and also respects the segmental constraint [Vano], even though it favours a variant of the suffix that is lower in the hierarchy of possible forms.
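
The comparison just made can be laid out as a small table of violations. The Python sketch below is a simplification under stated assumptions: it works on spelling, it counts each constraint as either satisfied (0) or violated (1), so it cannot express a partial violation, and the constraint labels and the function name `violations` are hypothetical. It does not rank the candidates, since the choice between variants is not deterministic.

```python
import re


def violations(base: str, candidate: str) -> dict[str, int]:
    """Return a 0/1 violation profile of a candidate derivative for four constraints."""
    return {
        # Base-derivative faithfulness: every segment of the base is preserved.
        "FAITH": 0 if candidate.startswith(base) else 1,
        # General constraint against sequences of identical vowels.
        "*VV": 1 if re.search(r"([aeiou])\1", candidate) else 0,
        # Segmental constraint [Vano]: a vowel or glide precedes final -ano.
        "SEG[Vano]": 0 if re.search(r"[aeiou]ano$", candidate) else 1,
        # Default variant [jano], written -iano.
        "DEFAULT[jano]": 0 if candidate.endswith("iano") else 1,
    }


# Assumption: lower-case, accent-free spellings of the forms discussed above.
for base, candidate in [
    ("wojtyla", "wojtylano"),
    ("wojtyla", "wojtyliano"),
    ("wojtyla", "wojtylaano"),
    ("deandre", "deandreano"),
]:
    print(candidate, violations(base, candidate))
```

No candidate satisfies all four constraints at once, which is why more than one of them can surface.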

From what has been said above, it is clear that a question such as "what belongs to the base and what belongs to the affix?" is no longer a relevant one. If we insist on seeing things in these terms, in deandreano the segment [e] "belongs" both to the base and to the affix. In terms more appropriate to the model defended here, the emergence of the segment [e] satisfies several formal constraints at once. It is therefore clear that in this framework a theoretical notion such as "morphological boundary", which has been important in several theoretical models (for example Lexical Phonology or Natural Morphology), no longer plays any role. In the examples in question there is no "boundary", since there are not two elements attached to one another, but rather the application of a set of formal constraints to a form (a stem). As can be seen, this step is particularly consistent with the progressive "dereification" of morphological exponents that research in morphology has carried out over recent decades.

Before concluding, let us observe that the segmental constraints on the form of constructed lexemes, which correspond to their exponents, are constraints of a particular type. Whereas constraints, in the classical sense, are meant to capture general, or even universal, properties of languages, here we are dealing with highly specified constraints whose domain of application is severely restricted. However, the constraint-based model of morphology I draw on already combines universal constraints with constraints specific to a given language (in this case French), and even constraints specific to a subpart of the language at a given stage of its evolution and limited to one of its modalities (for example the "Phonographic faithfulness constraint", Roché & Plénat 2014: 1873). If it is legitimate to have such constraints, which are not only non-universal but limited to sectors of the language, it seems to me that nothing prevents us, conceptually, from having constraints limited to particular constructions, all the more so since the constraints on the form of derivatives identified above stem from a more general constraint, the series constraint, which can itself lay claim to the status of a universal constraint of morphology.

To conclude this section, before turning to the illustration of the concrete cases studied in section 3, I summarize the different elements of the proposal I have put forward to account for the form of the outputs of morphological constructions. First, the form of a constructed lexeme is governed, among other things, by a series constraint stipulating that it must be as similar as possible, including segmentally, to the other lexemes of the same series. For each individual construction, this constraint takes the form of more specific constraints stipulating the segments that a derivative of the series must contain in order to be regarded as such, and their position (which corresponds to the affix in the traditional sense). These more specific constraints may be multiple, which accounts for the variation observed in morphological exponents; they may contradict or reinforce one another, and may be ranked, with, in the most common case, one of the variants functioning as the default. The form of the constructed lexemes actually observed is determined by the interaction of these segmental constraints with the other formal constraints, in particular the family constraint and those responsible for selecting the stem of the base lexeme. Roché & Plénat (2014) have presented several examples in which the selection of the base stem (or its manipulation) serves to satisfy the series constraint and/or the family constraint. In the following section, I will discuss cases in which this selection also interacts with the hierarchy of segmental constraints corresponding to the form of the exponent of constructions. Sometimes a specific stem is selected by virtue of its compatibility with one of the forms of the exponent that ranks high in the hierarchy; in other cases, a form that ranks lower in the hierarchy emerges because it is more compatible with the selected base stem, for example because other stems are not available.


3 The interplay of constraints in determining the form
of derivatives: two case studies


In this section, I apply the model outlined in section 2 to two examples of morphological constructions. I will show in particular that the exponent of a construction has a set of possible forms, whose emergence depends on the interaction with the other constraints at play (first and foremost the base-derivative faithfulness constraint). In all cases, I will refer to exponents in the text by an arbitrarily chosen form (generally the default form) written in small capitals (-PHONE, -ISSIMO, etc.), thus following the convention adopted by Lignon & Roché (2011) and the one generally accepted for lexemes.

The first case studied is the construction of nouns (or adjectives) that denote the speakers of a language and are built by means of the element -PHONE in French, which are compared with the nouns produced by the corresponding construction in Italian (-FONO). This example shows how two similar (and cognate) constructions in two closely related languages can display different formal properties (and hence a different set of segmental constraints). In Italian, the form of the exponent contains without exception a stressed [o] (inherited from the Greek combining element), whereas in French a segment with the quality /o/ is present only in the default form of the exponent, while its position may be occupied by another vowel (quechuaphone, ewephone) and even by a consonant (ocphone, pularphone), the quality of this segment being correlated with the form of the base stem. The constructions of speaker nouns in -PHONE / -FONO also have the particularity of selecting bases of variable complexity: in some cases the base is a lexeme that belongs to a large family, which therefore has a rich stem space and can consequently give rise to wide variation among derivatives (from portugais I have recorded lusophone, lusitophone, portugaisophone, portugalophone, portugophone); in other cases, the base is a language name that is related to no other lexeme in the lexicon, that sometimes has a phonological structure unusual in French, and that morphology must accommodate in order to obtain an output. We will see that the manipulations that the stems of certain bases undergo in French (as in portugophone) serve to satisfy various constraints, including the segmental constraints, and that manipulations of stems are much more limited in Italian, being restricted to one or two types. Finally, this case study will give me the opportunity to discuss the place of so-called "neoclassical compounding" in the morphological system of the two languages in question.

The second case studied is the construction of nouns or adjectives by means of the suffix -ISSIMO in French. Suffixation in -ISSIMO has the particularity of building nouns for which the semantic contribution of the morphological construction is very weak; in most cases they simply have a generically appreciative evaluative tinge. The possible bases for this derivation are therefore very loosely constrained semantically (and even categorially). Selection is then often made on a mainly or purely formal basis, using bases that are particularly compatible with the formal constraints to which derivatives in -ISSIMO are subject. This derivational process will be compared with the construction of adjectives and nouns in -ISSIME, which is older and closer to the canonical derivational processes of French.

The studies presented in this section belong to an extensive approach to morphology. This approach is based on the idea that, in order to understand the mechanisms that govern the construction of the lexicon, it is necessary, on the one hand, to observe a large quantity of data and, on the other hand, to take into account the non-established, non-institutionalized lexicon, which is therefore presumably built "on the spot" by speakers. This second point, in particular, corresponds to two possible sources of data: either one looks, for the canonical morphological processes of the language, at non-established forms such as neologisms, occasionalisms, etc. (this is the case of the first study proposed), or one looks at non-canonical morphological processes (this is the case of the following study). The underlying idea is that in the established lexicon, including among constructed lexemes, there is too great a risk of encountering words that have undergone formal and/or semantic drift unrelated to their mode of morphological construction, and hence that it is not the best vantage point for understanding morphological mechanisms as they operate in synchrony and "for real".

The kind of object I am interested in is, of course, not unproblematic, since it requires assembling databases of unattested forms that are large and reliable enough to support solid, predictive generalizations. The purpose of this article is obviously not to discuss the problems raised by extensive morphology, which have already been widely addressed in the literature (cf. the works cited in note 3). A few remarks on data collection and use will suffice here: for all the phenomena studied, I tried to assemble databases that, while not exhaustive, are as large as possible. The data were collected primarily from the Web-based corpora FrWac and ItWac.¹² These databases were enriched through targeted Web searches and, occasionally, from other sources. The contexts in which the lexemes included in the databases occur were checked in order to remove as much noise as possible (texts written by non-native speakers, typos, etc.). Since reliable frequency counts cannot be obtained, particularly on the Web, the analyses presented here consider only the types of derivatives included in the databases, not the number of their occurrences (tokens). Token frequency counts would of course be useful and interesting for confirming, qualifying or enriching the proposed analyses. At least four observations can nonetheless be offered to justify this choice: i) as I have indicated, in the study of derivational morphology the observation of newly produced lexemes (neologisms, occasionalisms, etc.) is just as interesting as that of the established lexicon; yet such lexemes are generally very rare, even in very large corpora; if one wants to give priority to the diversity of forms produced by speakers, one ends up with databases containing a large number of lexemes whose frequency of use is very low and therefore does not allow reliable statistical calculations anyway; ii) if, as in this work, one adopts a model of morphology based on the interaction of several constraints, then outside the established lexicon variation in the outputs of morphological constructions is the norm, and the frequency of a lexeme is not necessarily correlated with greater or lesser "regularity" from the standpoint of constructional morphology; iii) collecting databases that, though not exhaustive, are as large as possible in terms of types still makes it possible to propose generalizations and predictions about the application of a morphological construction to a given base; still larger studies, or studies taking other parameters into account, can confirm or falsify these predictions; iv) in addition to quantitative analyses, qualitative analyses can be proposed, in which the properties of each derived lexeme and of each of its possible variants are explicitly attributed to the predominance of one constraint (or set of constraints) over another.

¹² FrWac comprises ~1.6 billion tokens and ~6 million types; ItWac comprises ~2 billion tokens and ~6.2 million types (on these two corpora cf. in particular Baroni et al. 2009).


3.1 -PHONE / -FONO 


For the first case study, I assembled a database of 475 lexemes (nouns and/or adjectives) that designate, in French, the speakers of a language and that contain the final sequence [fon] preceded, in the great majority of cases, by the name of a language. A parallel database of 237 lexemes was built for Italian. To assemble the French database, I started from the one presented in Lasserre (2016), which I enriched first by extracting the forms ending in the sequences «phone» and «phones» from FrWac and manually cleaning this initial list, and then by systematic Web searches based on the lists of languages (list of the most widely spoken languages in the world and list of the official languages of the world's countries) in the French-language Wikipedia. The Italian database was built from ItWac and from systematic Web searches using the same resources as for French, as well as by applying -FONO to the names of the inhabitants of Italian regions and main cities. Both databases were completed by cross-searching for the counterparts of the lexemes present in one or the other. The fact that the French database is much larger than the Italian one (roughly twice as many entries) is certainly due to the salience, in French-speaking culture, of the terms francophone and francophonie. These words designate two concepts that developed and spread first in relation to the Canadian linguistic situation (from the end of the 19th century) and later in the political discourse of the postcolonial era. Since there is no Italian-speaking space comparable to that of French in number of speakers and geographical distribution, italofono and similar terms do not have the same connotation, and the use of lexemes in -FONO is, in general, largely confined to specialized discourse in linguistics, dialectology, etc.
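The first extraction step described above (collecting forms ending in «phone» or «phones» before manual cleaning) can be approximated as follows. This is only a minimal sketch: the token list stands in for a corpus word list, and the pattern and the function name `candidate_forms` are illustrative assumptions, not the procedure actually used to build the database.

```python
import re

# Hypothetical pattern: any word form ending in <phone> or <phones>.
PHONE_FINAL = re.compile(r"^\w+phones?$", re.IGNORECASE)

def candidate_forms(tokens):
    """Return the sorted set of distinct lowercase types ending in <phone(s)>."""
    return sorted({t.lower() for t in tokens if PHONE_FINAL.match(t)})

# Toy token list standing in for a corpus word list (assumption).
sample = ["francophone", "Anglophones", "téléphone",
          "phonétique", "lusophone", "francophone"]
candidates = candidate_forms(sample)
print(candidates)
```

Such a list still contains noise (here téléphone, an instrument name rather than a speaker name), which is why a manual cleaning pass remains necessary.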

Lexemes in -PHONE / -FONO are generally classed among neoclassical compounds, on account of the origin and presumed semantic value of the second element (from a Greek nominal lexeme meaning "voice"). In this work, however, I adopt a model of derivational morphology that draws no discrete distinction between types of constructions. The various constructions (compounding, neoclassical compounding, affixation) are instead located along a continuum, with different degrees of grammaticalization, that is, of conventionalization of the (formal, categorial and semantic) properties of the lexemes they serve to form.¹³ In this framework, no difference in kind is drawn between affixes in the traditional sense and so-called "neoclassical combining elements": in all cases they are exponents of constructions, which may differ in their degree of grammaticalization. In no case are these elements credited with an autonomous existence or lexical meaning (contra, for instance, Corbin 2001); in the synchronic functioning of the language they remain inseparable from the constructions that introduce them. As regards the -PHONE / -FONO constructions of French and Italian more specifically, several properties bring them close to canonical affixation. First, the lexemes formed by these constructions enter into derivational paradigms with other lexemes, simple or constructed; for example, lexemes designating the speakers of a language (francophone) pair with collective lexemes in -phonie (francophonie). Second, the semantic value supposedly conveyed by the element -PHONE is not always salient when these lexemes are used in context. In some cases, when they are used as adjectives (4a), their value comes close to that of other relational adjectives built on nouns; in other cases (4b), these same lexemes occur in syntactic constructions where they share the contexts and values of canonical relational (here ethnic) adjectives:


(4) a. Le mot « Rega » serait une transformation rwandophone survenue au XXe siècle, au même titre que le mot « Reka » d'origine ougandophone.
       [https://www.edilivre.com/frontwidget/preview/book/id/626357/]

    b. Cette « guerre » a aggravé et renforcé les tensions communautaires préexistantes entre communautés rwandophones et congolaises d'une région peuplée où les litiges fonciers étaient omniprésents...
       [http://www.revuenouvelle.be/Plus-de-quinze-annees-de-guerre-au-Kivu-Ca-suffit]


A third argument for bringing -PHONE / -FONO closer to canonical affixes concerns precisely their phonological behaviour in the two languages and the way constraints handle the different possible variants, which, as I will show below, does not differ from the behaviour of other elements whose identification as affixes is more consensual.

¹³ Cf. Lasserre & Montermini (2014) for a detailed discussion of the model.

Speaker names are not the only lexemes in which the Greek-derived elements -PHONE and -FONO occur. Restricting attention to French for the moment, -PHONE also appears in names of instruments (musical or otherwise) (xylophone, saxophone) or of sound devices (audiophone, téléphone) (cf. Lasserre 2016: 179-183). However, I consider that these lexemes belong to constructions that are distinct, even if their exponents are diachronically related. Several arguments support the idea that the -PHONE at issue is the exponent of a specific morphological construction, distinct from others whose exponents are (partially) homophonous: i) the lexemes derived with this element are semantically and categorially very homogeneous; on the latter point in particular, they are always both [+human] nouns and relational adjectives (which do not necessarily modify a human noun); ii) as shown above, lexemes designating the speakers of a language belong to homogeneous and specific derivational paradigms, which differ from those of the other types of lexemes. All lexemes in -PHONE can indeed have a corresponding lexeme in -PHONIE with a collective meaning (téléphonie, visiophonie), but derivatives in -iste (téléphoniste, saxophoniste) and in -ique (téléphonique, microphonique) are reserved for names of instruments and devices, which follows from the fact that speaker names are already both [+human] nouns and relational adjectives.

As for the bases selected by the construction, the simplest case is that in which a lexeme in -PHONE is built directly on a language name, which may designate the language only (5a), or may also serve as a demonym (5b) or as a non-constructed ethnic name (5c).¹⁴ If the base is a variable lexeme in French, the family constraint is respected and the stem selected is most often the same as the one selected by other morphological constructions, namely a stem L, which may be identical to a stem B (5d) or independent of it (5e). The base may also consist of the stem that is used to build demonyms, in which case the base is formally ambiguous, since it corresponds phonologically to the geographical name on which the demonym is built (5f). Finally, the base may be a radical resulting from the modification (generally a truncation) of a stem (5g), or a learned suppletive stem (5h).


(5) a. créolophone
    b. espagnolophone
    c. bascophone
    d. catalanophone
    e. coréanophone
    f. islandophone
    g. bulgophone, lettophone
    h. magyarophone, sinophone

¹⁴ On ethnic nouns / adjectives and the lexical networks in which they occur, cf. in particular Roché (2008).


A derivative may sometimes be ambiguous and belong to several of the above types at once; italophone, for instance, could belong to type (5f) as well as to type (5h). Moreover, as shown in section 2, the same base lexeme may give rise to several different derivatives belonging to different types. For portugais, for instance, the database contains the following derivatives: lusophone (5h), lusitophone (5h), portugaisophone (5d), portugalophone (5f, cf. below), portugophone (5f).

In most cases the base noun corresponds to the name of an identified and recognized language, as in the examples in (5). Since everyday taxonomies do not always match scientific ones, however, the base may also correspond to an ethnic name designating a group for which a specific language or variety is identified (écossophone, marocanophone), to an unofficial (slang) name of an ethnic group (ritalophone, rosbiffophone / rosbiphone), or to another noun, human (rebeuophone) or not (banlieuophone), provided that a "language" (a linguistic variety) specific to the group referred to can be identified.

Let us now turn to the formal properties of these derivatives. Prosodically, a size constraint is clearly identifiable: 83.57% of the lexemes considered (397) are tri- or quadrisyllabic (142 and 255 respectively). Figure 1 shows the exact distribution of the lexemes in the database by number of syllables.


Figure 1: Distribution of lexemes in -PHONE by number of syllables (categories: 2, 3, 4, 5 and 6 syllables)


The way the size constraint operates shows that, contrary to what one might have imagined, the weight of francophone as the leader word of the series is limited, at least as far as the size of the derivatives is concerned. One might indeed have expected the trisyllabic format to prevail, possibly at the cost of reducing overlong bases. Yet if we look at the most frequent lexemes in -PHONE in FrWac, francophone unsurprisingly comes far ahead, but the top ten are split almost evenly between tri- and quadrisyllabic forms.¹⁵

Of the 66 derivatives in the database that have five syllables, 28 also have at least one quadrisyllabic variant, most often obtained by truncating the base stem (type (5g), e.g. arménianophone / arménophone, tibétanophone / tibétophone). The same holds for 4 of the 9 derivatives with six syllables (américanophone / américophone). Conversely, of the 23 derivatives whose radical is obtained by truncating the base, 22 have a "long" variant, generally one syllable longer. Likewise, of the 91 lexemes of type (5f) (use of the same stem as that of a demonym), 79 have three or four syllables. We may therefore consider that the tri- or quadrisyllabic format satisfies a size constraint requiring that, in a constructed word, the base most often match the disyllabic format (cf. Plénat 2009); stem truncations mainly serve to satisfy this constraint (at the expense, of course, of the base-derivative faithfulness constraint).
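The syllable counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text, except for the two-syllable count, which the text does not give and which is inferred here as the remainder (an assumption of this example).

```python
# Counts reported in the text for the French -PHONE database.
TOTAL = 475
by_syllables = {3: 142, 4: 255, 5: 66, 6: 9}

# Not stated in the text: inferred as the remainder (assumption).
by_syllables[2] = TOTAL - sum(by_syllables.values())

tri_quadri = by_syllables[3] + by_syllables[4]
share_tri_quadri = 100 * tri_quadri / TOTAL

for n in sorted(by_syllables):
    share = 100 * by_syllables[n] / TOTAL
    print(f"{n} syllables: {by_syllables[n]:3d} ({share:.1f}%)")
print(f"tri- or quadrisyllabic: {tri_quadri} ({share_tri_quadri:.2f}%)")
```

The computed share of tri- and quadrisyllabic lexemes agrees with the reported figure up to rounding.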

As regards the segmental properties of the derivatives at the exponent, 79.5% of cases (378) end in [ofon] and 20.5% (97) end in [fon] preceded by another segment (most often a vowel, cf. below). An interesting correlation can be established here: in the second group, the segment preceding [fon] is already present in the base stem in all cases, whereas in the first group, the one ending in [ofon], the base stem has a final [o] in only 40 of the 378 derivatives, distributed as follows:


(6) a. stems ending in [o] (espérantophone, lesothophone): 17
    b. stems truncated at an [o] (lettophone, tagalophone): 10
    c. learned suppletive stems¹⁶ (germanophone, sinophone): 13


Figure 2 summarizes the situation described ("oui", i.e. "yes", indicates that the segment preceding [fon] is present in the base stem; "non", i.e. "no", that it is not).

For 338 lexemes in the database (71.2% of the total), then, the phonological operation simply consists in concatenating the sequence [ofon] to a stem, modified or not; for 30 others (cases (6a) and (6c) above), the presence of an [o] in the base can be regarded as purely fortuitous. Only the 10 lexemes of type (6b) show a manipulation whose effect is to yield a stem ending in [o]; in this case, however, reducing the stem also produces a tri- or quadrisyllabic derivative in every instance. At best, then, what we see here is a convergence between the size constraint and the constraint requiring the derivative to end in [ofon].
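The breakdown just given follows directly from the counts in (6). As a hedged illustration (variable names are chosen for this example only), the figures can be recomputed as follows.

```python
# Counts reported in the text for the French -PHONE database.
TOTAL = 475
ENDS_IN_OFON = 378
ENDS_IN_XFON = 97

# Derivatives in [ofon] whose base stem already has a final [o], as in (6).
stem_final_o = {"6a": 17, "6b": 10, "6c": 13}

with_stem_o = sum(stem_final_o.values())
plain_concatenation = ENDS_IN_OFON - with_stem_o
share_plain = round(100 * plain_concatenation / TOTAL, 1)
fortuitous_o = stem_final_o["6a"] + stem_final_o["6c"]

print(with_stem_o, plain_concatenation, share_plain, fortuitous_o)
```

Only the ten forms of type (6b) are left as cases where the stem is actively manipulated to end in [o].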


¹⁵ The ten lexemes in question are: francophone, anglophone, germanophone, arabophone, hispanophone, lusophone, néerlandophone, turcophone, berbérophone, russophone.

¹⁶ I take learned suppletive stems to have a final [o], insofar as they can appear in this form, for instance in compounds (germano-soviétique, sino-japonais).


Figure 2: Distribution of the segments preceding [fon], present or not present in the base (categories: [ofon], [Xfon]; legend: oui / yes, non / no)


Consider now the 97 cases in which the derivative does not end in [ofon]. First, more than two thirds of these derivatives (66) also have a variant in [ofon]. Moreover, as noted above, these are always cases like those exemplified in (7), in which the segment preceding [fon] is already present in the base stem as its final segment. (7) gives the number of derivatives for each final sequence:


(7) a. [afon]   aymaraphone    34
    b. [ifon]   swahiliphone   28
    c. [efon]   malinképhone    9
    d. [Cfon]   tamoulphone     8
    e. [wafon]  danoiphone      6
    f. [ɑ̃fon]   flamanphone     5
    g. [ufon]   ourdouphone     4
    h. [œfon]   banlieuphone    3


One might be tempted to treat the forms [afon] and [ifon] as sub-defaults, given their preponderance in this class of derivatives. It is likely, however, that their frequency mainly reflects the overall frequency of language names ending in [a] or [i] relative to other segments. Note that the 89 derivatives in which [fon] is preceded by a vowel other than [o] make up the majority of outputs for base stems ending in a vowel. The database contains 58 other derivatives from vowel-final bases, in which either the vowel is deleted in favour of [ofon] (bambarophone) or, much more rarely (only 6 examples), the sequence [ofon] is attached after the vowel (almost exclusively an [i], as in thaiophone); in the latter case, moreover, the bases are always short ones, liable to yield tri- or quadrisyllabic derivatives.
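The counts in (7) and the claim about vowel-final bases can be checked together. The sketch below simply tallies the reported figures; the dictionary keys are informal labels for the final sequences, introduced for this example.

```python
# Number of derivatives per final sequence, as listed in (7).
finals = {
    "afon": 34, "ifon": 28, "efon": 9, "Cfon": 8,
    "wafon": 6, "nasal-vowel+fon": 5, "ufon": 4, "front-rounded+fon": 3,
}

not_ofon = sum(finals.values())
# All sequences except [Cfon] have a vowel before [fon].
vowel_before_fon = not_ofon - finals["Cfon"]

# Other derivatives from vowel-final bases (vowel deleted, or [ofon] added after it).
OTHER_VOWEL_FINAL_BASES = 58
share_vowel_kept = 100 * vowel_before_fon / (vowel_before_fon + OTHER_VOWEL_FINAL_BASES)

print(not_ofon, vowel_before_fon, round(share_vowel_kept, 1))
```

On these figures, keeping the base-final vowel before [fon] accounts for roughly three fifths of the outputs from vowel-final bases, which is the majority referred to in the text.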

One interpretation of these data is to assign to the construction a default exponent form, [ofon], and a hierarchically subordinate variant, [Vfon] (where V stands for any vowel). The set of formal (series) constraints bearing on the outputs of this construction thus stipulates that a derivative must have four (failing that, three) syllables and end in [ofon] (failing that, in [Vfon]). The remaining formal properties observed in these derivatives follow from the other general constraints on the form of constructed words, in particular the base-derivative faithfulness constraint, which is responsible for the form of the lexemes in (7) and, more generally, for the quality of the vowel preceding [fon] when it is not [o]. The faithfulness constraint in turn interacts with the other constraints responsible for selecting and/or modifying base stems, for instance the family constraint. If a base is isolated in its lexical family (as is the case for most names of non-European languages), stem selection is not at stake: the single stem is chosen and, where necessary, manipulated to satisfy other constraints. If, on the contrary, the base belongs to a large lexical family, the stem selected may correspond to the name of the language, constructed or not (coréanophone, corsophone, picardophone; this corresponds roughly to the "constraint of faithfulness to the free form" of Roché & Plénat 2014: 1873), to a learned suppletive stem (francophone, lusophone, magyarophone), or, less preferentially, to the stem that appears before demonym-forming affixes and which, as noted above, corresponds in most cases to the geographical name of a country, region, etc. (islandophone, japonophone). In this last case most derivatives are ambiguous, like those just mentioned; it is possible, however, that both options are available, at least for some speakers. In some cases the base stem corresponds unambiguously either to the stem preceding an ethnic suffix (champenophone, néerlandophone) or to a geographical name (allemagnophone, portugalophone). Thus, if there is a noun in -PHONE built on a learned suppletive base, whether ambiguous (8a) or not (8b-c) with respect to another stem, one may find variants that give precedence to faithfulness to the free form of the language name (often homophonous with an ethnic name) and/or of a country name:

(8) a. italophone italianophone 
b. germanophone  allemandophone, allemagnophone 
c. hispanophone  espagnolophone, espagnophone 
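The formal constraints stated above for French -PHONE (four syllables, failing that three; final [ofon], failing that [Vfon]; base-derivative faithfulness) can be made concrete by tabulating which of them a few attested variants satisfy. The sketch below is illustrative only: the `Candidate` fields, the hand-annotated syllable counts and the numeric violation scale are assumptions made for this example, and no winner is computed, since variation among outputs is expected outside the established lexicon.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    form: str        # orthographic form of the derivative
    syllables: int   # syllable count, annotated by hand (assumption)
    ending: str      # "ofon", "Vfon" or "Cfon"
    faithful: bool   # True if the base stem surfaces unmodified

def violations(c: Candidate) -> dict:
    """Return a violation profile (0 = satisfied) for the three constraints."""
    size = 0 if c.syllables == 4 else 1 if c.syllables == 3 else 2
    ending = {"ofon": 0, "Vfon": 1}.get(c.ending, 2)
    faithfulness = 0 if c.faithful else 1
    return {"size": size, "ending": ending, "faithfulness": faithfulness}

candidates = [
    Candidate("arménianophone", 5, "ofon", True),
    Candidate("arménophone", 4, "ofon", False),
    Candidate("aymaraphone", 4, "Vfon", True),
    Candidate("bambarophone", 4, "ofon", False),
]
profiles = {c.form: violations(c) for c in candidates}
for form, profile in profiles.items():
    print(form, profile)
```

Each variant trades one constraint against another: arménianophone keeps the stem intact but exceeds the preferred size, whereas arménophone meets the size target by truncating the stem.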


Consider now the Italian data. The first thing to note is that all the lexemes in the corpus have, before the sequence [fon], an [ɔ] bearing the word's primary stress, and thus have the structure [Xˈɔfono].¹⁷ The exponent therefore has a fixed form that is both more constrained and longer than in French. Since, as a reminder, I take an exponent to be simply a phonological sequence arbitrarily associated with a construction, there is no need to look for reasons why Italian is more rigid than French as to the form of the exponent. One reason may nonetheless be that Italian is less tolerant of variation in a vowel bearing primary word stress, although it may be noted that this vowel is not always [ɔ] when the derivative is not a speaker name (telèfono, vibrafono).

¹⁷ The height of mid vowels is not important in this context, since it is phonologically determined by the position of stress. To give a complete phonological representation, I cite the masculine singular form (ending in -o), but what follows applies to all inflected forms of lexemes in -FONO (ending in -a, -i, -e).

Prosodically, the possible formats are more dispersed: the pentasyllabic format predominates, but almost as many lexemes have four or six syllables, as Figure 3 shows.


Figure 3: Distribution of lexemes in -FONO by number of syllables (categories: 4, 5, 6, 7 and 8 syllables)


As has already been observed in other cases, the size constraint is thus weaker in Italian than in French, and it is certainly subordinate to the base-derivative faithfulness constraint. As for the interaction between base and exponent, the default case in Italian is that in which the sequence [ɔfono] is attached directly to the base stem if the latter ends in a consonant (amazighofono, yiddishofono), or else, more frequently, the final vowel of the base is deleted (bantofono, ligurofono, quechuofono). These cases alone account for exactly two thirds of the derivatives in the database (158 out of 237), to which we may add 22 cases in which the base is a learned suppletive stem. 75.9% of the derivatives thus raise no particular problem, either for the choice of the base stem or for the phonological interaction between that stem and the exponent. Regarding final-vowel deletion in Italian derivation,¹⁸ two hypotheses are possible within a constraint-based thematic morphology: i) the stem selected is a vowelless stem, the same one found in other derivatives, selected in compliance with the family constraint; ii) the stem selected contains a vowel (for example a stem formally identical to one of the inflected forms), which is deleted under the effect of other constraints, for instance an anti-hiatus phonological constraint. The two hypotheses are not necessarily incompatible. The first may hold for bases that belong to large lexical families, whereas for the others it is harder to imagine that a vowelless stem is already present in the lexicon. Moreover, as I have tried to show in earlier work (Montermini 2003, 2010), vowel deletion in derivation is a phenomenon that is also, at least in part, influenced by phonology, some vowels being more readily deleted than others. The database considered here indeed contains two examples of non-deletion, bantuofono and urduofono (which coexist with the more "regular" forms bantofono and urdofono). That the undeleted vowel is [u] in both cases is perhaps no accident, since this is the vowel that generally resists deletion most in Italian (cf. the works cited above).

¹⁸ Cf. Montermini (2010) for discussion.
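The default Italian pattern described above (attach the exponent to a consonant-final stem, otherwise delete the final vowel first) and the associated proportions can be sketched as follows. This is a simplified orthographic approximation, not the author's formalization: the function name `attach_fono`, the base spellings and the use of vowel letters in place of phonological vowels are all assumptions of this example.

```python
from fractions import Fraction

VOWEL_LETTERS = set("aeiou")

def attach_fono(stem: str) -> str:
    """Default pattern: drop a final vowel letter, if any, then add -ofono."""
    if stem and stem[-1].lower() in VOWEL_LETTERS:
        stem = stem[:-1]
    return stem + "ofono"

# Base spellings are assumed for illustration.
bases = ["amazigh", "yiddish", "bantu", "ligure", "quechua", "urdu"]
derived = {b: attach_fono(b) for b in bases}
print(derived)

# Proportions reported in the text for the Italian database.
TOTAL = 237
DEFAULT_CASES = 158
LEARNED_STEMS = 22
share_default = Fraction(DEFAULT_CASES, TOTAL)
share_unproblematic = 100 * (DEFAULT_CASES + LEARNED_STEMS) / TOTAL
print(share_default, round(share_unproblematic, 1))
```

The function yields only the "regular" outputs; the attested variants bantuofono and urduofono, which retain [u], are precisely the cases it does not generate.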

As for the remaining derivatives, just under a quarter of the total, almost all show stem reductions and fall into two groups. Both consist mostly of lexemes that are variants of other, more "regularly" constructed lexemes. The first and larger group (45 lexemes) corresponds to the case already noted for French in which a lexeme in -FONO is built on a stem that also serves as the base for demonyms and/or for a geographical name. As in French, it includes many cases in which the base stem is ambiguous in this respect (9a), as well as rarer cases in which the stem is unambiguously either a demonym stem (9b) or a geographical name (9c):


(9) a. islandofono, milanofono 
b. portogofono 


c. polonofono 


The second, smaller group (5 lexemes in all) comprises derivatives in which the stem is reduced to a disyllabic format, regardless of its morphological structure (albofono, estofono, lettofono). This marginal tendency to have disyllabic bases (and hence quadrisyllabic derivatives) is most probably to be attributed to the tendency of neoclassical combining elements, especially initial ones, to be disyllabic in Italian (cf. Thornton 2007: 253-259). It is possible that, for some speakers, a noun in -FONO must still conform to the format of a neoclassical compound (perhaps on the model of derivatives whose base is a learned stem). Given the number of lexemes involved, however, this is a minority, even residual, tendency, which may be taken as indirect evidence that, synchronically, these formations tend to be handled by speakers as full-fledged affixal derivatives. Unlike in French, it is difficult to establish a precise correlation between these reductions of the base stem and any prosodic constraint, since, as we have seen, size constraints matter less in Italian and are probably subordinate to faithfulness constraints.

To conclude the analysis of Italian, the two constraints that seem to prevail in the construction of nouns in -FONO are the constraint on the form of the derivatives, which in fact bundles several segmental and prosodic constraints and stipulates that they must have the structure [Xˈɔfono], with no strong constraint on the number of syllables, and the base-derivative faithfulness constraint. This results in a weaker tendency than in French to modify base stems in order to satisfy prosodic or segmental constraints.

What the comparison between the two languages shows is that apparently similar constructions, in the course of their integration into the phonological and morphological systems of the languages concerned, may in fact develop as sets of differently arranged constraints. Italian has developed a construction in which the form of the exponent is tightly constrained and faithfulness between base and derivative takes precedence over the other formal constraints, whereas prosodic size constraints carry less weight. In French, by contrast, these constraints play an important role, as in other affixal processes; combined with the base-derivative faithfulness constraint, this leads to a diversification of the possible segmental structures of the exponent, which, while it still preferably contains an etymological vowel of quality /o/ at the juncture between the base stem and the exponent, admits other vowels, or even other segments, in the same position.


3.2 -ISSIMO and -ISSIME


The second case study concerns a French suffix that, to my knowledge, has not yet attracted the interest of linguists and lexicographers. This is the suffix -ISSIMO, found in particular in names of shops, events, brands or products, the best known probably being Colissimo and Doctissimo. However, there are also contexts in which lexemes in -ISSIMO are created and used in discourse by speakers, such as the following:²⁰


(10) a. Enfin bref je suis tout le contraire de ce qu'il aime c'est ca le plus drolissimo.
        [Twitter, 4 novembre 2013]
     b. J'ai un « torticolissimo ». C'est-à-dire que mon cou est coincé depuis 3 semaines et que personne ne sait quand la situation sera débloquée.
        [Twitter, 26 mai 2015]
     c. C'était vraiment énorme! L'entrée de Médine énorme. Daniel Allouche (speaker), énorme. Le public havrais, énormissimo (sic).
        [http://www.lebannerofficial.com/index.php?option-com content&task-view&id-355]


¹⁹ All the lexemes in -ISSIMO cited in this section are listed in the Appendix, with an indication of their meaning in the contexts in which they were found.

²⁰ It is possible that, for these discourse uses of lexemes in -ISSIMO, categorial and semantic constraints weigh more heavily than formal constraints, compared with those that serve as commercial names, which would bring them closer, in this respect, to lexemes in -ISSIME (and to other "canonical" constructed lexemes). However, I have found too few examples of this type to draw reliable conclusions. If this is correct, the ordering of constraints would also be influenced by parameters external to morphology, linked to the pragmatic and sociolinguistic use of constructed lexemes. (I thank the reviewer of this article for making me reflect on this point.)


The lexemes above occupy a canonical noun or adjective position and convey a broadly
appreciative or superlative meaning. In this respect, the suffix is close to -ISSIME, which
has the same origin but a different history. Both derive from the Latin superlative suffix
-issimus. According to dictionaries, -ISSIME entered French via Italian from the 14th century
onwards, initially through terms of address such as sérénissime (Perko 2010). As for -ISSIMO,
its Italian origin is made even more evident by the final vowel [o] (it can moreover be
considered to have a variant in [a], for example in Diorissima, Naturissima, etc.). Without
being certain of it, I assume that its availability in French was reinforced by the existence
of a number of musical terms borrowed directly from Italian (fortissimo, pianissimo, etc.).
The earliest attestations I have been able to document date from the second half of the
1960s: Vernissimo appears in the slogan of a 1966 nail polish advertisement, Parfumissimo in
a 1969 soap advertisement, and Erotissimo is the title of a 1969 film. As shown above, the
suffix, first used in names, has partially entered everyday language. It is worth noting
that, although it is sometimes used in contexts specific to Italian (or more generally
"Latin") reality, this is by no means systematic, as shown in particular by the third
attestation in (10).

For this study, I compiled a database of 294 lexemes. As with -PHONE, the database was built
by first collecting the words ending in the sequences <issimo> or <issima> in FrWac. Here
too, the list was cleaned manually; in addition, the context of each form was checked in
order to eliminate the many examples from pages written in Italian or Latin that FrWac had
collected. Likewise, all the musical terms mentioned above, as well as other clear cases of
direct borrowing (for example campionissimo), were eliminated. Finally, the list was
supplemented with words in -ISSIMO from various sources²¹ and through targeted Web searches.
In parallel, I compiled a list of 373 lexemes in -ISSIME attested in FrWac, which I compare
with those in -ISSIMO.

Turning first to -ISSIME, it attaches mainly to adjectives or nouns to form superlatives.²²
From a formal point of view, Plénat (2002) identified at least four parameters that define
its behaviour:


i) it is a "semi-learned" suffix, which can select both learned and popular stems
(universalissime vs. naturellissime);


ii) the suffix -ique may be deleted before -ISSIME (catholissime, nostalgissime);


iii) the rime (final vowel and consonant) is dropped if the base stem is tri- or
quadrisyllabic and ends in a latent sibilant consonant (bruxellissime, rigourissime);


iv) the rime is dropped if the base stem is tri- or quadrisyllabic and ends in an
[i] followed by a latent consonant (favorissime, interdissime).


²¹ A list containing many words in -ISSIMO was provided to me by my colleague Antonella Capra,
whom I thank.

²² In the whole database there is only one lexeme in -ISSIME that is unquestionably built on a
word that is neither a noun nor an adjective: obligatoirementissime.


Properties i) and ii) capture tendencies rather than rules. The second in particular has
several exceptions (critiquissime, sympathiquissime), for which Plénat hypothesizes that the
lexemes in question selected a popular stem, whereas it is learned stems that lose the
sequence [is] before -ISSIME in order to satisfy the dissimilation constraint (catholissime
vs. *catholicissime). The FrWac data seem to indicate that -ISSIME in fact tends to select
popular bases: of 23 derivatives built on bases that have an L stem distinct from stems A
and B, 17 use stem A or B (lamentablissime, sensuelissime, supérieurissime) and only 6 use
stem L (formidabilissime, prétenciosissime). As for parameters iii) and iv), the potentially
relevant FrWac data are extremely rare, but nevertheless seem to confirm Plénat's
hypotheses. In the whole database, only three lexemes fail to conform to the tendencies
identified: andalousissime, prétenciosissime (iii) and favoritissime (iv). Note, however,
that while these lexemes do not respect the (dissimilatory) phonological constraints
underlying the principles in question, they fully respect base-derivative faithfulness. The
database also contains four lexemes that are cases of overapplication of the rules above,
that is, deletions that occurred where they would not have been expected: Barbérissime,
Optalissime (iii), splendissime and sublissime (iv). The last two are already discussed by
Plénat; the first two are hapaxes built, respectively, on the proper name Barbéris and on
Optalis, the trade name of a range of financial products. Regarding the cases of deletion
discussed by Plénat, however, another fact is worth noting. The effect of these deletions is
that the radical on which the derivative in -ISSIME is built is almost always identical to
stems from the derivational family of the base, which in most cases correspond to the stem
of an autonomous lexeme. This is the case of the examples bruxellissime and nostalgissime,
and also of the derivatives prestigissime and ténébrissime, attested in FrWac. From a more
current perspective, these cases could probably be explained in terms of stem selection
rather than deletion. Note finally, however, that deletion certainly takes place in several
cases where the base ends in a vowel, in particular [e], where it is systematic in the
database (six relevant items in total; branchissime, pavissime ← pavé). Overall, in any
case, modifications of base stems remain extremely rare in the database. In total, they
concern only 32 lexemes (less than 10% of the database), distributed as follows:


(11) a. deletion of a vowel + sibilant rime: 5 (prestigissime)
     b. deletion of an [i] + consonant rime: 4 (érudissime)
     c. deletion of the suffix -ique: 4 (exotissime)
     d. deletion of a final vowel: 14 (branchissime, pourrissime)
     e. epenthesis: 5 (absolutissime, merveilleutissime)


As regards the prosodic structure of the derivatives in the database, the distribution is
similar to that observed for the Italian derivatives in -FONO, with a predominance of the
quadrisyllabic format but with derivatives spread across the tri-, quadri- and
pentasyllabic formats. The distribution of the derivatives in the database by number of
syllables is summarized in Figure 4.

Figure 4: Distribution of lexemes in -ISSIME by number of syllables


This distribution can be related to the rarity of stem manipulation observed above. The
co-occurrence of these two factors suggests that size constraints, if they are active, are
subordinate to the base-derivative faithfulness constraint: the size of derivatives then
depends more on the size of the bases (whose length in syllables is randomly distributed)
than on manipulations of the stems.

Let us now turn to the lexemes in -ISSIMO. The first observation concerns the categorial and
semantic properties of the suffix. As noted above, alongside "canonical" cases such as those
illustrated in (10), -ISSIMO is often used to coin names of shops, events, brands, products,
etc., as well as nonce formations intended for use in slogans. Its semantic value is
therefore limited in most cases to a superlative, or even generically positive, connotation.
The potential bases for this suffix are thus less constrained semantically, and even
categorially; sometimes, on the contrary, the selection of the base (or of the base stem
retained) seems to be made on the basis of its formal compatibility with the construction
rather than its semantic compatibility. A first consequence is that the potential bases of
-ISSIMO are much more varied than those of -ISSIME, including categorially. For example, the
database contains 5 lexemes that are almost unambiguously built on verbs (12a), as well as
at least 13 lexemes for which it is difficult to decide whether the base is a verb or a noun
(more rarely a verb or an adjective) (12b).


(12) a. Courissimo, Jonglissimo, Repassimo²³
     b. Agrandissimo, Investissimo, vomissimo


If one wished to favour categorial homogeneity, one could assume that, among the words in
(12b), Agrandissimo and Investissimo are built on the nouns agrandissement and
investissement, and vomissimo on vomi; if, on the contrary, one favours formal transparency,
one can suppose that these lexemes are built, from the available stems of the verb, on the
one most compatible with the constraints imposed by the construction (here Stem 1, the one
ending in [is] for second-conjugation verbs). In the analysis I adopted this second
solution, and therefore treat the lexemes in question (and similar ones) as built on a verb,
one of whose stems is selected.²⁴ This choice seems justified by the fact that, in other
cases, the radical selected for derivation in -ISSIMO may correspond to one of the stems
available in the stem space, chosen either for its phonological compatibility with the
exponent or because of other factors. Cases such as Linguissimo, optimissimo or scientissimo
can, it seems to me, only be analysed in this way.

A second observation concerns the final sequence of these derivatives. Unlike with -ISSIME,
in lexemes derived with -ISSIMO the sequence [simo] may be preceded by a vowel other than
[i], namely [a], [e], [o] and [y]. In total, 24 lexemes in the database are concerned:


(13) a. Pizzassimo, Prépassimo
     b. Bébéssimo, Cinessimo
     c. Dodossimo, Vélossimo
     d. Revenussimo


At this point, I think it is clear that the best way to account for this variability in the
model adopted here is to attribute it to allomorphy of the exponent, the choice of vowel
depending on a segment present in the base. This point is developed below.

As regards selection of the base stem, setting aside the uncertain cases mentioned above,
-ISSIMO seems to behave, like -ISSIME, as a semi-learned suffix, although the data are too
scarce to draw firm conclusions. Of 9 lexemes built on bases that have an L stem distinct
from stems A and B, 5 use stem L (for example Urbanissimo, Valorissimo) and 4 use a stem A
homophonous with the autonomous stem (formidablissimo, incroyablissimo); 9 others use a
suppletive stem of learned origin (altissimo, Equissimo, Historissimo).


²³ Repassimo is the name of a dry cleaner's and is therefore very probably built on repasser.

²⁴ A slightly more complex case, which can nonetheless receive the same explanation, is that
of derivatives built on stem 13 of a verb (cf. Bonami et al. 2009), for example Locatissimo,
Nutrissimo, Sélectissimo.

As regards the modifications undergone by base stems, almost no example in the database
confirms the observations made by Plénat (2002) for -ISSIME (cf. (11)), apart from 4
derivatives of an adjective in -ique in which this suffix is deleted, as in the derivatives
in -ISSIME:


(14) a. Authentissima
     b. Erotissimo
     c. Olympissimo
     d. Optissimo


When the -ISSIME and -ISSIMO databases are compared, however, the most striking fact is
certainly the large proportion of base stems that have undergone a modification in the
latter. In total, 124 of the 294 derivatives in -ISSIMO (42.2%) show a modification of the
base (almost exclusively reductions), whereas for the lexemes in -ISSIME, as noted above,
the proportion was under 10%. The types of modification encountered are detailed in (15):


(15) a. deletion of a vowel (≠ [i]) + sibilant rime: 11 (dégueulassimo,
        Promessimo, Revenussimo)
     b. deletion of an [i] + consonant rime²⁵: 52 (Apéritissimo, Jurissimo,
        Permissimo, Tennissimo)
     c. deletion of a final vowel: 61 (Bébéssimo, Espérantissimo, Pizzassimo)


It is notable, moreover, that practically all the bases belonging to one of the types
(15a-c) are reduced. The only four exceptions are Blingissimo (which has a monosyllabic
base), Bijoutissimo (whose stem is used elsewhere, for example in bijoutier), Caféissimo and
successissimo, which coexist in the database with Caféssimo and successimo. Note also that,
unlike what Plénat observed for -ISSIME, the length of the base stem does not seem to have
any particular bearing on its chances of being modified, since stems of different lengths
can be reduced, including monosyllabic ones (cf. Tassimo ← tasse, the name of a coffee
brand).

Each of the types presented in (15) deserves a closer look. Among the bases whose stem ends
in a vowel other than [i] plus a consonant (latent or not), only one (anglissimo) has an
exponent containing the vowel [i]. In all the others, as in those exemplified above, the
vowel preceding [simo] is the same as the one in the stem. Among the bases in [i] +
consonant, 34 end in a sibilant (or in the sequence [st], as in Jurissimo, the name of a law
firm), and 18 end in another sequence, almost always a single consonant. In a few cases,
however, the base stem is cut at an [i] that precedes a sequence longer than a single
consonant:


(16) a. Apprentissimo
     b. Acquissimo
     c. narcissimo
     d. Numissima
     e. Ravissimo


²⁵ This figure includes the 4 bases in -ique mentioned in (14).


The first two examples in particular are interesting. In one respect, they parallel
Agrandissimo and Investissimo in (12b), since they derive from two action nouns
(apprentissage, acquisition), but the base used in these cases does not correspond to one of
the stems of the verb. As for type (15c), more than half of the vowel-final stems end in
[i], and the others are distributed as shown in Table 1.


Table 1: Distribution of deleted final vowels (see 15c)


Vowel        Count
[i]          34
[o]          12
[a]           7
[e]/[ɛ]       7
[y]           1


When the final vowel of the base is [i], the exponent obviously always has the form
[isimo]. When it is a different vowel, the exponent also has the form [isimo] in one third
of cases (9 out of 27, for example Espérantissimo), and a form in which [simo] is preceded
by the same vowel as in the base in the remaining two thirds (Bébéssimo, Pizzassimo).
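
The counts reported so far can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the figures are copied from (15) and from Table 1, while the variable names are hypothetical and introduced here for the purpose of the check.

```python
from fractions import Fraction

# Counts from Table 1: final vowels deleted in type (15c).
deleted_final_vowels = {"i": 34, "o": 12, "a": 7, "e/ɛ": 7, "y": 1}

# Counts from (15): the three types of stem reduction.
reduction_types = {
    "vowel (≠ i) + sibilant": 11,
    "i + consonant": 52,
    "final vowel": 61,
}

# All vowel-final bases, and those whose final vowel is not [i].
total_final_vowel = sum(deleted_final_vowels.values())
non_i = total_final_vowel - deleted_final_vowels["i"]

# 9 of the bases ending in a vowel other than [i] take the default [isimo].
default_share = Fraction(9, non_i)

print(total_final_vowel, non_i, default_share, sum(reduction_types.values()))
```

The totals agree with the text: 61 vowel-final bases, 27 of them ending in a vowel other than [i], exactly one third of which take [isimo], and 124 reduced bases overall.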

In Table 2, I break down the figures presented in (15), giving the distribution of reduced
stems according to the sequence subject to reduction.

Table 2: Distribution of the types of final sequence in reduced stems


Final sequence          Count
[i] + sibilant          34
[i]                     34
V ≠ [i]                 27
[i] + consonant         18
V ≠ [i] + sibilant      11


What do these data as a whole suggest? First, it seems to me, they suggest that the exponent
does not have a fixed vocalic segment as in the case of -ISSIME. As I proposed for -PHONE,
the exponent of the -ISSIMO construction can be considered to have a default form [isimo]
and a subordinate form [Vsimo], whose emergence depends crucially on the base-derivative
faithfulness constraint. More precisely, from the segmental point of view, this construction
imposes the two ranked constraints [Xisimo] » [XVsimo] on the form of its derivatives. From
the prosodic point of view too, we observe behaviour partly different from that of the
-ISSIME construction, for which I argued that size constraints play a lesser role than in
other word-formation processes in French, and in particular that they are subordinate to the
base-derivative faithfulness constraint. For -ISSIMO, the distribution of derivatives by
number of syllables is given in Figure 5.

Figure 5: Distribution of lexemes in -ISSIMO by number of syllables


In interpreting these figures, it must be borne in mind that, since -ISSIMO ends in a vowel,
a quadrisyllabic derivative in -ISSIMO corresponds to a trisyllabic derivative in -ISSIME, a
pentasyllabic one to a quadrisyllabic one, and so on. Taking this difference into account,
the two most frequent formats for -ISSIME (three and four syllables) account for 79.6% of
cases (cf. Figure 4), whereas for -ISSIMO the two most frequent formats (four and five
syllables) account for 91.4% of cases. The size constraint thus appears to be stronger for
-ISSIMO than for -ISSIME, which would explain the greater tendency of this construction to
modify the selected base stems by reducing them. This tendency observed for -ISSIMO also
has, however, another explanation, complementary to the one just proposed. In Table 3, I
summarize the number of bases that undergo a modification (reduction) of the stem for
-ISSIME and -ISSIMO, comparing it with the total number of bases that meet the conditions
for such a modification (vowel + sibilant rime, [i] + consonant rime, vowel-final). The
first figure gives the number of potentially modifiable bases, the second the number of
bases actually modified:


Table 3: Number of bases undergoing a modification


            Modifiable bases    Modified bases
-ISSIME     99 (26.5%)          28 (28.3%)
-ISSIMO     128 (43.5%)         124 (96.9%)


These figures tell us essentially two things: first, in derivation in -ISSIMO a potentially
modifiable base is almost systematically modified; second, in this derivation, bases whose
segmental structure is compatible with a reduction (and hence with an amalgam with the
exponent) are over-represented compared with derivation in -ISSIME, for which the
distribution of bases, selected mainly on categorial and semantic grounds, can be considered
random from the phonological point of view.²⁶ This over-representation is precisely due to
the weak categorial and semantic selection that -ISSIMO exerts on its bases, which leaves
room for phonology to play a more important role. It matters little that -ISSIMO is, in this
respect, a non-canonical construction, most affixes indeed giving priority to categorial and
semantic properties in selecting their bases. What these data and their interpretation bring
to light is one of the paths that morphology can take in conventionalizing the properties
(here formal ones) associated with its constructions.
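
The percentages in Table 3 follow from the raw counts and the two database sizes given earlier (373 lexemes in -ISSIME, 294 in -ISSIMO). The sketch below recomputes them; the function and variable names are hypothetical. Halves are rounded up with integer arithmetic because Python's built-in `round` rounds ties to even, which would turn 96.875 into 96.8 rather than the 96.9 printed in the table.

```python
# Counts from Table 3 and the database sizes given in the text.
DATA = {
    "-ISSIME": {"total": 373, "modifiable": 99, "modified": 28},
    "-ISSIMO": {"total": 294, "modifiable": 128, "modified": 124},
}


def percent_half_up(numerator, denominator):
    """Return a percentage with one decimal place, rounding halves up."""
    tenths = (2000 * numerator + denominator) // (2 * denominator)
    return tenths / 10


def rates(entry):
    """Return (share of modifiable bases, share of modifiable bases actually modified)."""
    return (
        percent_half_up(entry["modifiable"], entry["total"]),
        percent_half_up(entry["modified"], entry["modifiable"]),
    )


for suffix, entry in DATA.items():
    print(suffix, rates(entry))
```

The first value of each pair is the proportion of bases that meet the segmental conditions for reduction; the second is the proportion of those bases that are in fact reduced, which is where the two constructions diverge most sharply.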

To conclude, the formal constraints attached to the -ISSIMO construction can be taken to be
the following: the derivative must have the form [Xisimo] » [XVsimo] (where the possible
forms of the exponent are ranked); the derivative must have four or five syllables or,
failing that, six syllables or more. One can also suppose that categorial and semantic
constraints on base selection are replaced by formal selection constraints, which can be
formulated and ranked as follows:

An optimal base for a derivative in -ISSIMO:


— is bi- or trisyllabic;
— ends in [i] or in a sequence [i] + sibilant;
— has more than three syllables and contains a sequence [i] + sibilant in the second or
  third syllable;
— ends in [i] followed by another consonant;
— ends in a vowel other than [i], followed or not by a sibilant.


²⁶ The number of potentially modifiable bases is in fact overestimated in Table 3 for
-ISSIME, since no distinction is made here between bisyllabic bases and bases longer than
two syllables, the latter being the only ones that, according to Plénat, can undergo
deletion of a complex rime.

As can be seen, the lower-ranked constraints relax one of the properties specified by the
first two: the number of syllables, the quality of the vowel, or the nature of the consonant
in the rime.
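
Part of this hierarchy can be sketched in code. The example below is a simplified illustration, not the analysis itself: it ranks only the final-shape conditions (second, fourth and fifth items of the list), reports the size condition separately, and does not model the third condition, which would require syllabification. The broad transcriptions, the vowel inventory (oral vowels only, one symbol per segment) and all function names are assumptions made for this sketch.

```python
VOWELS = set("aeiouyɛɔøœəɑ")  # assumption: oral vowels only, one symbol per segment
SIBILANTS = set("sz")


def syllable_count(phon):
    """Count syllables as the number of vowel symbols in the transcription."""
    return sum(1 for seg in phon if seg in VOWELS)


def final_shape_rank(phon):
    """Rank the final sequence of a non-empty base: 0 is best, None means no listed shape."""
    last = phon[-1]
    prev = phon[-2] if len(phon) > 1 else ""
    if last == "i" or (prev == "i" and last in SIBILANTS):
        return 0  # [i] or [i] + sibilant
    if prev == "i" and last not in VOWELS:
        return 1  # [i] + another consonant
    if last in VOWELS or (prev in VOWELS and last in SIBILANTS):
        return 2  # vowel other than [i], followed or not by a sibilant
    return None


def profile(phon):
    """Return (final-shape rank, whether the base is bi- or trisyllabic)."""
    return final_shape_rank(phon), syllable_count(phon) in (2, 3)


# Hypothetical transcriptions of bases attested in the database.
for base in ["tenis", "kɔli", "aperitif", "bebe", "tas"]:
    print(base, profile(base))
```

On this sketch, tennis and colis fall in the best final-shape class, apéritif in the second, and bébé and tasse in the third, with tasse and apéritif also failing the size condition.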

As a reminder, I argued above that constructions in -ISSIME are subject, as far as base
selection is concerned, to categorial and semantic constraints similar to those at work in
other canonical affixal constructions. From the segmental point of view, this construction
specifies only that the derivative must have the form [Xisim]; from the prosodic point of
view, if size constraints exist, they are subordinate to base-derivative faithfulness
constraints.
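
The segmental side of this contrast can be made concrete with a small sketch. It treats each exponent as an ordered list of shape constraints and reports the highest-ranked one that a transcribed derivative satisfies. The regular expressions, the vowel inventory and the transcriptions are assumptions made for illustration; nothing here decides which shape a given base will actually take.

```python
import re

OTHER_VOWELS = "aeouyɛɔøœ"  # assumption: vowels other than [i], one symbol each

# -ISSIMO: default [Xisimo] ranked above the subordinate [XVsimo].
ISSIMO_SHAPES = [
    ("Xisimo", re.compile(r".+isimo$")),
    ("XVsimo", re.compile(rf".+[{OTHER_VOWELS}]simo$")),
]

# -ISSIME: a single segmental specification, [Xisim].
ISSIME_SHAPES = [("Xisim", re.compile(r".+isim$"))]


def exponent_shape(form, shapes):
    """Return the name of the highest-ranked shape the form satisfies, or None."""
    for name, pattern in shapes:
        if pattern.match(form):
            return name
    return None


# Hypothetical broad transcriptions of Colissimo, Pizzassimo and Bébéssimo.
for form in ["kɔlisimo", "pidzasimo", "bebesimo"]:
    print(form, exponent_shape(form, ISSIMO_SHAPES))
```

A form such as Pizzassimo satisfies only the subordinate shape, whereas the single -ISSIME specification has no fallback: a hypothetical form ending in [asim] matches nothing.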


4 Conclusion 


Accounting for discrepancies between the expected and the actually observed form of
morphologically complex lexemes is one of the areas in which morphological research, on
French and on other languages, has evolved most in recent decades. This has led, on the one
hand, to the recognition of lexemes as complex structures to which several stems may
correspond synchronically: formal representations that are irreducible, but connected to one
another and organized. Among the operations that (derivational) morphology carries out when
a morphological rule (or construction) applies is the definition of a radical, that is, the
form to which the formal operation specified by the rule is applied. This definition
involves the selection of one of the stems of the base lexeme and possible phonological
modifications of it. One way of modelling this set of operations is to consider that they
are governed by a set of constraints, that is, specifications of the properties that a
derived lexeme must have. Constraints may be specific to a language (or even to a sector of
the language) or universal; they may reinforce or contradict one another, in which case the
form actually observed for a derivative is determined by the tendency to satisfy one
constraint or another, with potentially different outcomes when an operation is applied to
the same base. Work inspired by this model, however, has mainly focused on stem variation
and the factors responsible for it; variation in the exponents of morphological
constructions (which corresponds to what was traditionally seen as affixal allomorphy), by
contrast, has attracted less interest. Yet I have put forward strong arguments that some
cases of formal variation observed in derivation cannot be treated in terms of stem
variation. It must therefore be accepted that the exponents of morphological constructions
can also be subject to variation, a variation that deserves to be taken into account and, if
possible, modelled. This raises, first of all, the problem of clearly distinguishing cases
of variation of an exponent within the same construction from cases of different
constructions that may happen to have similar semantics and formally similar exponents. For
two forms to be considered variants of the same exponent, they must not only be similar and,
if possible, linked by natural phonological relations, but must also appear in derived
lexemes with similar categorial and semantic properties (that is, lexemes belonging to the
same series), and above all they must be in complementary distribution, or at least their
contexts of occurrence must be clearly identifiable phonologically. To model the variation
of morphological exponents, I proposed extending the notion of constraint to cover
properties specific not only to a given language but also to a given construction. The
exponents of morphological constructions can then themselves be seen as constraints, or sets
of constraints, which interact with the other constraints at play in the formation of
complex lexemes. Each "allomorph" of an exponent is thus a constraint which, as such, can be
ranked with respect to the others; this accounts for the observation that some of these
variants act as defaults, whereas others emerge only under particular conditions.²⁷

To illustrate the model I propose, I carried out two case studies of morphological
constructions that have emerged or developed recently. This work is set within an extensive
approach to morphology, in which it is essential to take into account a large amount of data
and, if possible, data that reflect speakers' actual word-building practice. This is why the
observation of new formations, neologisms, nonce words, etc. is just as important as, if not
more important than, the observation of the established lexicon. The two constructions I
considered are the formation of names of speakers in -PHONE from the name of a language, and
the formation of lexemes with a broadly appreciative or superlative meaning in -ISSIMO. The
first has the particularity of taking as bases both language names that belong to large
lexical networks, for which selection is therefore at stake, and language names that have
few or no lexical connections, which can therefore undergo modifications intended to make
them "good" radicals for the construction in question. The second, because of its pragmatic
value, places few categorial and semantic constraints on its potential bases, which are
instead selected on formal grounds, according to their compatibility with the segmental
constraints that define its exponent. Each of these two constructions is also the subject of
a comparison. Derivation in -PHONE is compared with the corresponding, cognate derivation of
lexemes in -FONO in Italian; derivation in -ISSIMO is compared with the more canonical
derivation of superlatives in -ISSIME in French. These comparisons highlight the fact that
constructions that are formally and semantically similar and share the same origin can, in
different languages or in the same language at different times and for different purposes,
develop different phonological specifications, which in the framework adopted here
translates into different and/or differently ranked sets of constraints. I take this as a
demonstration that the exponent of a morphological rule is simply the arbitrary association
between a set of categorial and semantic specifications and a set of formal constraints.


²⁷ A reviewer of the article suggests, as an alternative, that a construction may comprise
several variants of the exponent, the choice among which is determined by selection
constraints (a system that is, in my view, similar to the one proposed by Bonet et al. 2007
for Haitian Creole and Catalan, which provides, for certain morphological processes, for a
"catalogue" of ranked variants). It is true that the effectiveness of constraints has
already been shown for stem selection in word-formation processes (cf. Plénat & Roché 2014,
Boyé & Plénat 2015), and such a hypothesis would make it possible to unify the analysis of
the two. However, it seems to me that such a hypothesis should be regarded, at best, as a
variant of the main hypothesis I defend, for at least two reasons: i) in some cases, such as
that of -PHONE, the "catalogue" of exponents would amount to a simple, largely redundant
list of forms (in the case in question, [fon] preceded by any vowel and possibly by several
consonants); ii) this hypothesis would not capture the interaction between the form of the
base stem and the exponent, a crucial element of the analysis proposed here.

The constraint-based model of morphology opens up many avenues of research and potential
connections with related theoretical models (for example Construction Morphology). Although
it has so far been applied almost exclusively to French, this model deserves to be tested on
other languages and on more varied data sets. The work presented here is, I hope, a first
step in this direction.
is not followed by any indication, this means that it was found in use in discourse and
that its meaning corresponds roughly to the superlative of the base lexeme.
Table 4: List of lexemes in -ISSIMO


Acquissimo (type of mortgage)
Agrandissimo (building contractor)
Altissimo (climbing equipment shop)
Anglissimo
Apéritissimo (aperitif bar)
Apprentissimo (apprenticeship fair)
Authentissima (furniture shop)
Bébéssimo (website selling products for children)
Bijoutissimo (jewellery sales website)
Blingissimo (jewellery brand)
Caféissimo (coffee oil)
Caféssimo (coffee sales website)
Cinessimo (name of a credit card offering cinema discounts)
Colissimo (parcel service of La Poste)
Courissimo (running competition)
Dégueulassimo
Diorissima (perfume by Dior)
Doctissimo (medical website)
Dodossimo (children's pyjamas)
Drolissimo
Énormissimo
Equissimo (horse fair)
Erotissimo (film, 1969)
Espérantissimo (Esperanto website)
Formidablissimo
Historissimo (bookshop)
Incroyablissimo
Investissimo (estate agency)
Jonglissimo (juggling festival)
Jurissimo (law firm)
Linguissimo (language competition)
Locatissimo (flat rental website)
Narcissimo
Naturissima (nature fair)
Numissima (company that buys back valuables)
Nutrissimo (board game about nutrition)
Olympissimo (board game about the Olympic Games)
Optimissimo
Optissimo (chain of opticians)
Parfumissimo (advertising slogan, 1969)
Permissimo (website for recovering driving licence points)
Pizzassimo (tomato sauce for pizza)
Prépassimo (preparatory school)
Promessimo (type of property contract)
Ravissimo (ravioli cutting machine)
Repassimo (dry cleaner's)
Revenussimo (financial advice website)
Scientissimo (website for science activities)
Sélectissimo (business club)
Successimo
Successissimo
Tassimo (coffee machine)
Tennissimo (sports centre)
Torticolissimo
Urbanissimo (urban planning website)
Valorissimo (estate agency)
Vélossimo (cycling association)
Vernissimo (advertising slogan, 1966)
Vomissimo

A frame-semantic approach to polysemy in affixation

One of the central problems in the semantics of derived words is polysemy. The most
advanced theory of derivational semantics to date is the Lexical Semantic Framework
developed by Lieber (2004 et seq.). This theory, however, does not have a straightforward
answer to the question of which kinds of meaning extensions are possible and which ones
should be impossible for a given derivative. This is all the more so for deverbal
derivation, where Lieber explicitly leaves open exactly what the ‘semantic body’ of verbs,
i.e. (roughly) the encyclopedic and cultural knowledge involved in interpretation, looks
like (Lieber 2004: 72).


This paper tackles this problem by putting forward a new formal approach to derivational 
semantics, i.e. frame semantics. In frame theory (Barsalou 1992a,b, Löbner 2013), frames are 
complex structures which model mental representations of concepts. These representations 
are typed, recursive attribute-value structures, where the attributes are functional relations, 
assigning unique values to the concept they describe (see Petersen 2007). Using the appa- 
ratus of this framework, we hypothesize that the semantics of a derivational process is de- 
scribable as its potential to perform certain operations (such as metonymic shifts) on the 
frames of its bases. 


We propose a particular model of affixal polysemy in which attested readings of words of 
a given morphological category result from indexation of particular elements of the frame- 
semantic representation, combined with inheritance mechanisms. For deverbal nominaliza- 
tions in English -ment, the shifts can target (syntactically) argumental and non-argumental 
components. Different bases thus go along with different kinds of semantic shifts in their 
derivatives. Given a particular verb class, possible readings of the respective derivatives are 
predictable. 

A frame-semantic approach to polysemy in affixation. In Olivier Bonami, Gilles Boyé,
Georgette Dal, Hélène Giraudo & Fiammetta Namer (eds.), The lexeme in descriptive and
theoretical morphology, 467–486. Berlin: Language Science Press.


Ingo Plag, Marios Andreou & Lea Kawaletz 


1 Introduction 


In many languages polysemy in word-formation is all-pervasive (e.g. Rainer 2014). Fol- 
lowing Bauer et al. (2013), Kawaletz & Plag (2015: 291) list a number of readings of English 
deverbal nominalizations involving the suffixes -ing, -ation, -ment, -ance/-ence, -th and 
conversion, as given in Table 1. 


Table 1: Readings of English nominalizations (Kawaletz & Plag 2015) 


Semantic category   Paraphrase                                Examples
Event               ‘the event of V-ing’                      production, training
Result              ‘the outcome of V-ing’                    acceptance, alteration
Product             ‘the thing that is created by V-ing’      pavement, growth
Instrument          ‘the thing that V-s’                      seasoning, advertisement
Location            ‘the place of V-ing’                      dump, residence
Agent               ‘people or person who V-s’                administration, cook
Measure             ‘how much is V-ed’                        pinch, deceleration
Path                ‘the direction of V-ing’                  decline, direction
Patient             ‘the thing affected or moved by V-ing’    catch, acquisition
State               ‘the state of V-ing or being V-ed’        alienation, disappointment
Instance            ‘an instance of V-ing’                    belch, cuddle


For other languages, similar lists have been produced. For example, for French we find
the data shown in Table 2 in Fradin (September 7, 2012) (see also, for example, Uth (2011),
Fradin (2011), Fradin (2012) for French, Roßdeutscher & Kamp (2010), Roßdeutscher (2010)
for German).


Table 2: Readings of French nominalizations (Fradin September 7, 2012) 


Semantic category   Paraphrase                        Example          Translation
Event               ‘action of V-ing’                 lavage           ‘washing’
Product             ‘resulting object’                construction     ‘building’
Means               ‘what Vs’                         emballage        ‘wrapping’
State               ‘fact of being Ved’               embrouillement   ‘muddle’
Manner              ‘manner of V-ing’                 marche           ‘gait’
Location            ‘place where one V-s’             garage           ‘garage’
Group               ‘people who V’                    équipage         ‘crew’
Period              ‘time during which one V-s’       hivernage        ‘wintering’


These facts raise a number of very general questions. Do affixes have meaning, and if
so, how can we describe this meaning? Given the variety of interpretations that
derivatives of a given affix can give rise to, this does not seem to be a trivial task.
Which kinds of readings or meaning extension are possible and which ones should be
impossible for
a given derivative? How does the meaning of the base interact with the meaning of the 
affix? What are the principles or mechanisms that account for this interaction? In spite 
of the growing number of studies in this domain the answers to these questions are still 
under debate and we are still facing the task of accounting “for the substantial evidence 
that affixes [...] are frequently semantically underspecified, and subject to polysemy and 
meaning extensions of various sorts” (Bauer et al. 2013: 641). 

The crucial question is how the different readings of a given derivative emerge, and, 
as a result, how the different readings of different derivatives of a particular morpholog- 
ical category come about. Some generalizations have been proposed that give at least 
partial answers to these questions. For example, authors like Bauer et al. (2013: 212) have 
claimed that certain base verbs evoke certain readings in the nouns derived from them, 
but systematic studies exploring this claim in more detail and with larger amounts of 
data are rare. Hence, Bauer et al. (2013: 213) only list a few potential generalizations, for 
example that state nominalizations frequently derive from verbs of psychological state, 
and that verbs with inherently spatial denotations give rise to location nominalizations. 

With regard to French, Ferret (2013) and Ferret & Villoing (2015) hold that specific 
readings of derived nouns only arise “if very specific semantic conditions are met by the 
base verb” (Ferret & Villoing 2015: 480). In the case of instrument readings with nouns 
in -oir or -age, this reading can only occur if the base verb denotes an externally caused 
event which involves an instrumental semantic participant. 

What is perhaps noteworthy at this point is the fact that deverbal nominalizations can 
not only lexicalize the event denoted by the verb or the verb’s syntactic arguments, but 
also other entities that are part of the semantic representation of the base verb. For illus- 
tration consider (1). In (1a) we find an eventive interpretation of the converted noun pur- 
chase, while in (1b) there is an object argument reading (‘the thing that was purchased’). 
Similarly, (2a) shows an eventive reading, but, as shown in (2b), other entities can also
be profiled. Thus an embroidery is not the thing that is embroidered (i.e. the internal
argument of the verb), but the entity that results from the activity of embroidering.


(1) a. [S]earching through the store to find someone to help, I completed my pur- 
chase and then went home feeling dismissed (COCA NEWS 1998) 


b. Outside the store I deposited my purchase in a trash can. (COCA FIC 2008) 


(2) a. Her daughter Daphne wisely made no comment and pretended to be engrossed
in her embroidery. (COCA FIC 2000)
b. [T]he nails of her feet and hands matched the color of the embroidery of her 
leine. (COCA FIC 2010) 


In this paper we will introduce a new approach to the formalization of the interpreta- 
tion of derived words based on frames and apply this approach to the analysis of -ment 
derivatives that are based on change-of-state verbs and psych verbs. 

2 The framework: Frame semantics 


The approach adopted in the present paper builds on predecessors in cognitive science 
and artificial intelligence such as Marvin Minsky’s (1975) frame theory, the schema the- 
ory of Bartlett (1932), and, specific to linguistics, Fillmore’s work on situation frames 
(Fillmore 1982; see Busse 2012 for a historical overview of the development of frame 
semantics). We use the notion of ‘frame’ in the specific sense of Barsalou (1992a,b),
Petersen (2007) and Löbner (2013). In this framework, frames are recursive attribute-value
structures as known from other frameworks (e.g. HPSG, Pollard & Sag 1994). Frames are 
taken to be a general format of mental representations of concepts which is also appli- 
cable to linguistic phenomena. Frames can be depicted as graphs with nodes and arcs, 
or as attribute-value matrices, as shown for the toy example John hit the ball in Figure 
1, with the graph on the left and the attribute-value matrix on the right. 


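Graph (left): a referential node hit (double circle), with an AGENT arc to the node John
and a PATIENT arc to the node Ball.

Attribute-value matrix (right):

    [0]  hit
         AGENT    [1] John
         PATIENT  [2] Ball


Figure 1: Two ways of depicting a frame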


In both representations the referential node, which represents the event as a whole, 
is labeled hit (marked by a double circle in the graph), and this hitting event has two 
attributes (which, in this case, stand for the participants), an AGENT attribute with the 
value John and a PATIENT attribute with the value ball. Entities in graphs and matrices 
are often indexed for ease of reference (for example with [0], [1] and [2] as in the attribute- 
value matrix). 

In this approach, attributes are functional in the mathematical sense. The attribute- 
value structures are recursive and they allow for structure sharing (identities of attribute 
values). The values by which an attribute can be specified are subordinate concepts of 
this attribute (Barsalou 1992b: 43). In Petersen’s frame approach, the resulting taxonomy 
is incorporated in the type signature underlying each frame (cf. Petersen 2007: Def. 8 
and Fig. 9). 
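
The properties just listed can be illustrated with a minimal sketch of the toy frame in
Figure 1. The class name Node, its get method and the use of Python dictionaries are
assumptions made purely for illustration; they are not part of Petersen's or Löbner's
formalism. Storing attributes in a dictionary makes each attribute functional (one value
per attribute), and letting values be nodes makes the structure recursive.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality: a node is only equal to itself
class Node:
    """Hypothetical frame node: a type label plus functional attributes."""
    type: str
    attrs: dict = field(default_factory=dict)  # attribute name -> exactly one value node

    def get(self, *path):
        """Follow a path of attributes from this node, e.g. get("AGENT")."""
        node = self
        for attribute in path:
            node = node.attrs[attribute]
        return node


# Toy frame for "John hit the ball" (Figure 1).
john = Node("John")
ball = Node("ball")
hit = Node("hit", {"AGENT": john, "PATIENT": ball})

print(hit.get("AGENT").type)    # type label of the AGENT value
print(hit.get("PATIENT").type)  # type label of the PATIENT value
```

The type hierarchy that Petersen builds into the type signature is left out of this sketch.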

Returning to the problem of verbal bases, our formalism can be used to depict the 
semantic representation of specific verb classes. For illustration consider a class that is 
frequently discussed in the literature and that is also a possible base for -ment derivation, 
change-of-state verbs (e.g. Levin 1993, Levin & Rappaport Hovav 1995, Rappaport Hovav 
& Levin 1998, Dowty 1979, Pustejovsky 1991, Van Valin & LaPolla 1997, Alexiadou et al. 
2015). According to many analyses, causation events as expressed by change-of-state 
verbs (such as break) are complex events that consist of two sub-events, a cause and an 
effect. In a frame semantic analysis, causation events can be formalized as in Figure 2. 

    [0]  causation event
         ACTOR       [1]
         UNDERGOER   [2]
         INSTRUMENT  [3]
         CAUSE       [4]  activity
                          ACTOR       [1]
                          UNDERGOER   [2]
                          INSTRUMENT  [3]
         EFFECT      [5]  change-of-state
                          INITIAL STATE  [6]  state
                                              PATIENT  [2]
                          RESULT STATE   [7]  state
                                              PATIENT  [2]


Figure 2: Change-of-state verbs 


Figure 2 depicts a typical change-of-state verb. The representation is based on estab- 
lished semantic roles (e.g. ACTOR, UNDERGOER) in combination with an event frame. In 
other words, it combines the participants typically associated with such verbs, and em- 
beds them in the event structure assumed for externally caused events. 

A change-of-state verb has three core participants: ACTOR ([1]), UNDERGOER ([2]) and,
quite often, an INSTRUMENT ([3]). One of the two sub-events, CAUSE ([4]), consists of an
activity with the same three participants. The CAUSE sub-event is typically an activity, but
could also be any other type of event. The activity has an EFFECT ([5]), which constitutes
the second sub-event, which is a change-of-state. The change-of-state involves an INITIAL
STATE ([6]) and a RESULT STATE ([7]) of a PATIENT. The PATIENT of the two states is the
UNDERGOER of the event.
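
The description of Figure 2 can be sketched as a small data structure. This is only an
illustration under stated assumptions: the Node class, the function name
causation_event and the placeholder type label "entity" for the participants are
hypothetical and do not come from the frame literature. The point of the sketch is
structure sharing: the same participant node is reused in the CAUSE and EFFECT
sub-events rather than copied.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so shared nodes can be detected with "is"
class Node:
    """Hypothetical frame node: a type label plus functional attributes."""
    type: str
    attrs: dict = field(default_factory=dict)

    def get(self, *path):
        node = self
        for attribute in path:
            node = node.attrs[attribute]
        return node


def causation_event():
    """Build the partial change-of-state frame described for Figure 2."""
    # "entity" is an assumed placeholder type for the three participants.
    actor, undergoer, instrument = Node("entity"), Node("entity"), Node("entity")
    cause = Node("activity", {"ACTOR": actor, "UNDERGOER": undergoer,
                              "INSTRUMENT": instrument})
    effect = Node("change-of-state", {
        "INITIAL STATE": Node("state", {"PATIENT": undergoer}),
        "RESULT STATE": Node("state", {"PATIENT": undergoer}),
    })
    return Node("causation event", {"ACTOR": actor, "UNDERGOER": undergoer,
                                    "INSTRUMENT": instrument,
                                    "CAUSE": cause, "EFFECT": effect})


frame = causation_event()
# The PATIENT of the result state is the very same node as the UNDERGOER of the event.
print(frame.get("EFFECT", "RESULT STATE", "PATIENT") is frame.get("UNDERGOER"))
```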

Another verb class that is very common as a base for -ment derivatives is that of psych 
verbs. The use of the term ‘psych verb’ is not consistent in the literature, and different 
authors define this class differently. We use the term in this paper as referring to so-called 
‘object experiencer verbs’. These are verbs (such as amuse) where the subject denotes the 
stimulus, and the object denotes the experiencer in an event in which the experiencer 
undergoes a change in its psychological state (see, for example, Levin (1993: 189) for 
discussion). Psych verbs can thus be considered a sub-class of change-of-state verbs, and 
they are also referred to as ‘psych causation’ verbs. A frame-semantic representation of 
such verbs is given in Figure 3. 

The verb has two arguments, a STIMULUS and an EXPERIENCER. Similar to the representation
of change-of-state verbs, there are two sub-events, CAUSE and EFFECT. The CAUSE
is an activity which has two participants, the ACTOR and the UNDERGOER, and the EFFECT
is a change-of-psych-state in the EXPERIENCER entity. Note that the frames depicted here
are only partial, as they omit all information that is not immediately relevant for our
discussion.


    [0]  psych causation
         STIMULUS     [1]
         EXPERIENCER  [2]
         CAUSE        activity
                      ACTOR      [1]
                      UNDERGOER  [2]
         EFFECT       change-of-psych-state
                      INITIAL STATE  [3]  psych state
                                          EXPERIENCER  [2]
                      RESULT STATE   [4]  psych state
                                          EXPERIENCER  [2]


Figure 3: Psych verbs


In the following we will apply the frame-semantic approach to the morphological
category of -ment derivatives in English. Kawaletz & Plag (2015) have already presented
a first analysis of psych verbs as bases for -ment derivation. We will extend this analysis to
other verb classes and propose an account in which attested readings of -ment words re- 
sult from indexation of particular elements of the frame-semantic representation, com- 
bined with inheritance mechanisms. Specific interpretations can target (syntactically) 
argumental and non-argumental components, and, consequently, different types of base 
verb go with different kinds of readings. Given a particular verb class, possible readings 
of the respective derivatives are predictable. As a result, the multiplicity of meaning 
in a particular morphological category can be expressed in an inheritance hierarchy of 
lexeme formation rules. Predecessors of our approach are, for example, Desmets & Villo- 
ing (2009) and Tribout (2010), who also tackle polysemy in word formation by positing 
(slightly different) feature structure representations of lexical semantics in inheritance 
hierarchies. 


3 The suffix -ment: Data collection and attested readings 


3.1 Overview 


The nominalizing suffix -ment derives event nominals of various readings, among which 
Bauer et al. (2013: chapter 10) list events (assessment), results (containment), states (con- 
tentment), products (pavement), instruments (entertainment) and locations (embankment). 
The suffix was very productive in earlier periods, particularly between the 15th and 17th 
centuries (Marchand 1969, Lindsay & Aronoff 2013), but is still moderately productive in 
present-day English with many “novel or low-frequency words” (Bauer et al. 2013: 199) 
in corpora such as the Corpus of Contemporary American English (COCA) (Davies 2008) 
or the British National Corpus (BNC) (Burnard 1995). The suffix mainly attaches to verbs, 
but adjectival (foolishment) and nominal bases (illusionment) are also attested, as well as 
many bound roots (compartment) (Bauer et al. 2013: 198). 


3.2 Methodology 


For the present study we were interested in new coinages, as these can be taken to best 
reflect the present day speakers’ morphological knowledge. The investigation of old and 
established forms is of course also possible, but such forms are more prone to exhibiting 
idiosyncratic properties resulting from long-term semantic drift or other processes that 
accompany lexicalization. Plag (1999: 119), for example, states that “[t]he advantage of 
dealing primarily with neologisms is that by largely excluding lexicalized formations 
one has a better chance to detect the properties of possible words rather than of actual 
words, which may eventually lead to the correct formulation of the productive word 
formation rule instead of merely stating redundancies among institutionalized words.”

In order to arrive at a sizeable number of forms we first extracted all pertinent ne- 
ologisms of the 20th and 21st centuries from the Oxford English Dictionary (OED). In 
addition, we searched COCA for hapax legomena, i.e. words that occur only once in a 
corpus. Hapax legomena are not necessarily new words, but the proportion of actual 
neologisms is highest among hapax legomena (see, for example, Plag 2003: chapter 3.4 
for discussion). We ended up with 109 deverbal -ment derivatives. We then categorized 
the base verbs according to the verb classes proposed by Levin (1993) (and extended in 
the VerbNet project, Kipper et al. 2008). The verbs come from 29 verb classes, with the 
class of psych verbs being the largest in the data set (N=23). 
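
The hapax step can be sketched as follows. This is a minimal illustration, not the
procedure actually used for COCA: the function name ment_hapaxes, the toy token list
and the purely string-based test for -ment are assumptions. A string test also picks up
words that merely end in the letters ment, so the output would still need manual
checking before being treated as a list of derivatives.

```python
from collections import Counter


def ment_hapaxes(tokens):
    """Return the word types ending in 'ment' that occur exactly once in tokens."""
    counts = Counter(token.lower() for token in tokens)
    return sorted(
        word for word, n in counts.items()
        if n == 1 and word.endswith("ment") and len(word) > len("ment")
    )


# Toy corpus (hypothetical): "moment" occurs twice and is therefore not a hapax.
sample = "the befoulment and the Government moment , a moment of confoundment".split()
print(ment_hapaxes(sample))
```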

In order to investigate possible interpretations of the derivatives, we sampled attesta- 
tions from other corpora (e.g. GloWbE, WebCorp, Google). The attestations were seman- 
tically coded using semantic categories such as STATE, EVENT, EXPERIENCER, STIMULUS, 
RESULT STATE, etc. (see section 3.3 for further discussion). The examples in (3) illustrate 
the EVENT, RESULT STATE and STIMULUS readings. 


(3) a. EVENT Did you put a sound system in your car not specifically for your en- 
joyment but for the perturbment of others within three square miles? (Google 
BLOG 2008) 
b. RESULT STATE I know a lot of our compatriots also feel the same angst, conster- 
nation and confoundment. (GloWbE ART 2012) 


c. STIMULUS Here comes a confoundment (new word I just made up :) ) for you. 
(Google COMM 2006) 


The reader might wonder whether this way of sampling data might favor readings that 
necessarily deviate from the ordinary, the reason for this being that the new formations 
in -ment may have been coined because a competing nominalization with another suffix 
already expressed a more expectable meaning. Two points are important in this respect. 
First, synonymy blocking has been shown to be an inadequate concept to explain the 
attested distributions of competing affixes. Very often, different affixes appear on the 
same base with no discernible difference in meaning (e.g. Bauer et al. 2013: section 26.4). 
Second, we find the full range of meanings in our data that have also been described 
in the literature on -ment (e.g. Bauer et al. 2013, Marchand 1969). We can thus safely 
assume that our data represent the semantic possibilities contemporary speakers and 
listeners of English have at their disposal when creating, using and interpreting -ment 
nominalizations. 

The crucial question is which interpretations are possible and whether or how these 
interpretations depend on the semantics of the base verb. To answer that question the 
following sections will present an analysis of the attested readings couched in the frame- 
semantic approach sketched above, focusing on two verb classes, i.e. change-of-state 
verbs and psych verbs. 


3.3 Results: attested readings 


Our findings on change-of-state verbs are illustrated in (4). 


(4) a. EVENT 
Markham sets down the rules about park befoulment. (WebCorp BLOG 2012) 


b. INSTRUMENT 
Minimal bleeding and I didn't have to have any guaze/tissue in my mouth at 
all to try and stop it? I'm thinking that they must have used a congealment or 
something to make it clot while I was under or something? (GloWbE COMM 
2010) 


c. CAUSE (activity) or EVENT 
Why do we as Blackpool Fans sit and take this constant bedragglement and 
farce, what is it we are scared of? (Google COMM 2013) 


d. EFFECT (change-of-state) 
For one second she clung to her son, and then, disengaging herself, froze up 
like the sudden congealment of a spring. (Google FIC 2008) 


e. EFFECT (result state) 
Sarcasm, Deb ... trying to excuse the bedragglement of the hair, etc?. (Google 
COMM 2013) 


f. PATIENT (in result state) 
I set down the scrap of doll's dress, a bedragglement of loose lace hem (COCA 
FIC 1999) 


In (4a) we find an EVENT interpretation. This type of derivative is often referred to 
as ‘transpositional’ in the sense that the derived word preserves the sense of the base 
verb and merely recategorizes (‘transposes’) the word from verb to noun (but see Lieber 
(2015) for a critique of such a view). In (4b), congealment denotes the INSTRUMENT, that
is, the participant that is manipulated by an ACTOR, and with which an (intentional) act
is performed.¹ In (4c), bedragglement is ambiguous between an EVENT ‘transpositional’
reading and a CAUSE reading. In the case of a CAUSE reading, bedragglement denotes the 
first subevent, i.e. the causing event, in the complex event, which is most frequently an 
activity. The nominalization congealment in (4d) refers to the second subevent, i.e. the 
change-of-state. Bedragglement in (4e) denotes a RESULT STATE, that is, the state that the 
undergoer is in after or during the event. Finally, in (4f), bedragglement is interpreted as 
the PATIENT in a result state, that is, as the participant that is affected by the event. 

As far as -ment derivatives that are based on psych verbs are concerned, some prelim- 
inary results appeared in Kawaletz & Plag (2015). In the present paper, we build on those 
findings and provide new data. Example (5) lists all readings attested for this class in our 
data. 


(5) a. EVENT 
Did you put a sound system in your car not specifically for your enjoyment 
but for the perturbment of others within three square miles? (Google BLOG 
2008) 


b. STIMULUS 


Here comes a confoundment (new word I just made up :) ) for you. (Google 
COMM 2006) 


c. CAUSE (event) 
I realize that I often awaken in mindless mid-journey getting jarred by a pot- 
hole in the road. That's a quick call-to-action, or perturbment. Mindfulness will 
curb that perturbment and make the journey all the more pleasant and fulfill- 
ing. (WebCorp COMM 2013) 


d. EFFECT (change-of-psych-state), CAUSE (activity) or EVENT 
“[...] that being told, ‘that job is not for you’ is an enraging experience.” In her 
own case, Miss Reuben said, the enragement began when a professor told her 
that it really wouldn't matter if she finished her doctoral thesis. (Google MAG 
1972) 


e. EFFECT (result state) 
I know a lot of our compatriots also feel the same angst, consternation and 
confoundment. (GloWbE ART 2012) 


As is the case with -ment on change-of-state verbs, -ment derivatives that are based on 
psych verbs can denote the whole EVENT, giving rise to ‘transpositional’ readings as in 
(5a). In a similar vein, they can denote the first, causing subevent as in (5c) and the state 
that the undergoer is in after or during the event, as in (5e). In addition, -ment
derivatives that are based on psych verbs can denote the STIMULUS. This finding contradicts
Pesetsky’s claim that stimulus or event nominalizations should be impossible


¹ In the present paper, no claim is made with respect to the relation between instruments and means. For
such a discussion, the interested reader is referred to Fradin (2012).
with psych verbs (Pesetsky 1995: 71): “Amusement does not refer to something amusing
something, but to the state of being amused” (see also Kawaletz & Plag (2015) for
this observation). In (5b), confoundment denotes the participant that elicits an emotional 
or psychological response in the experiencer. Notice that this reading is not evident 
in derivatives that are based on change-of-state verbs. With respect to change-of-psych- 
state readings as in (5d), it should be noted that we have found no unambiguous example 
of a derivative with this particular reading. 

Among our neologisms RESULT STATE is the dominant reading. This is in accordance 
with findings in the literature (e.g. Bauer et al. 2013: 209, Pesetsky 1995). EVENT ‘trans- 
positional’ readings, CAUSE readings, change-of-(psych)-state readings, and RESULT STATE 
readings are attested with both change-of-state verbs and psych verbs. INSTRUMENT and 
PATIENT (in result state) readings are only attested with change-of-state verbs. Finally, 
STIMULUS readings are only available with psych verbs. 
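
The distribution just summarized can be restated with simple set operations. The
sketch below is only a restatement of the findings above; the variable names are
hypothetical, and change-of-state and change-of-psych-state readings are conflated
under a single assumed label so that the two classes can be compared.

```python
# Attested readings per verb class, as reported in section 3.3.
# "CHANGE-OF-(PSYCH)-STATE" is an assumed cover label for both EFFECT readings.
CHANGE_OF_STATE = {"EVENT", "CAUSE", "CHANGE-OF-(PSYCH)-STATE", "RESULT STATE",
                   "INSTRUMENT", "PATIENT IN RESULT STATE"}
PSYCH = {"EVENT", "CAUSE", "CHANGE-OF-(PSYCH)-STATE", "RESULT STATE", "STIMULUS"}

shared = CHANGE_OF_STATE & PSYCH                 # readings attested with both classes
only_change_of_state = CHANGE_OF_STATE - PSYCH   # INSTRUMENT, PATIENT IN RESULT STATE
only_psych = PSYCH - CHANGE_OF_STATE             # STIMULUS

print(sorted(shared))
print(sorted(only_change_of_state))
print(sorted(only_psych))
```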


4 Formalization 


In what follows we generalize over the observations we made in the previous section. 
In particular, we give all referential shifts attested per verb class for -ment derivatives in 
the form of attribute-value matrices. 

Figure 4 generalizes over -ment lexemes that are based on change-of-state verbs. The 
frame also contains phonological specifications. 

In order to formalize possible referential shifts, we introduce the attribute REF that 
signals ‘reference’. The value of this attribute determines the reference of the derived 
word. As depicted in Figure 4, the reference (REF) of a lexeme with the phonology x- 
ment, that is based on a change-of-state verb, may be identified with one of the elements 
of the morphological base (M-BASE). In more detail, the value of REF is [0] in the case of
EVENT ‘transpositional’ readings, [3] when the derived word denotes the INSTRUMENT,
[4] in CAUSE readings, [5] in change-of-state readings, [7] in RESULT STATE readings, and,
finally, the combination of [2] and [7] when the derivative denotes the PATIENT in RESULT STATE.²


² It is not an easy task to formally define a referent that is in a particular state (out of more than one
possible state) in the course of a dynamic event, here a PATIENT in RESULT STATE in a change-of-state event.
The difficulty arises from the fact that dynamic elements would need to be incorporated into the
(essentially static) attribute-value matrix. There have been several attempts to solve this vexed issue, and
the interested reader is referred to these proposals (Gamerschlag et al. 2014, Löbner 2017, Osswald
submitted). Future work will have to show how a technical definition of PATIENT in RESULT STATE can be
included in the frames we propose in this paper.

    lexeme
    PHON    [x]-ment
    M-BASE  PHON  [x]
            SEM   [0]  causation event
                       ACTOR       [1]
                       UNDERGOER   [2]
                       INSTRUMENT  [3]
                       CAUSE       [4]  activity
                                        ACTOR       [1]
                                        UNDERGOER   [2]
                                        INSTRUMENT  [3]
                       EFFECT      [5]  change-of-state
                                        INITIAL STATE  [6]  state
                                                            PATIENT  [2]
                                        RESULT STATE   [7]  state
                                                            PATIENT  [2]
    REF = [0], [3], [4], [5], [7], [2]+[7]


Figure 4: -ment on change-of-state verbs 


In a similar vein, Figure 5 gives all possible referential shifts attested in -ment deriva- 
tives that are based on psych verbs. 

Based on this figure, the reference (REF) of a lexeme with the phonology x-ment may
have the value [0], [1], [3], [4], or [6]. Thus, it may refer to one of the elements of the verbal
base: [0] accounts for EVENT ‘transpositional’ readings, -ment derivatives with value [1]
refer to the STIMULUS, [3] captures CAUSE readings, [4] accounts for change-of-psych-state
readings, and -ment derivatives with REF [6] have a RESULT STATE reading.
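
The idea that REF picks out indexed elements of the base frame can be sketched as a
simple lookup. Everything in this sketch is an illustrative assumption rather than part
of the formalism: the dictionaries, the function name readings, and in particular the
encoding of the PATIENT in RESULT STATE reading as the pair (2, 7). The index-to-label
tables follow the numbering used in Figures 4 and 5.

```python
# Index -> frame element, following the numbering of Figures 4 and 5.
NODES = {
    "change-of-state": {0: "EVENT", 1: "ACTOR", 2: "PATIENT", 3: "INSTRUMENT",
                        4: "CAUSE", 5: "CHANGE-OF-STATE",
                        6: "INITIAL STATE", 7: "RESULT STATE"},
    "psych": {0: "EVENT", 1: "STIMULUS", 2: "EXPERIENCER", 3: "CAUSE",
              4: "CHANGE-OF-PSYCH-STATE", 5: "INITIAL STATE", 6: "RESULT STATE"},
}

# Attested REF values per verb class; a tuple of two indices is an assumed
# encoding of a combined reference such as PATIENT in RESULT STATE.
REF_OPTIONS = {
    "change-of-state": [(0,), (3,), (4,), (5,), (7,), (2, 7)],
    "psych": [(0,), (1,), (3,), (4,), (6,)],
}


def readings(verb_class):
    """Spell out the readings that the REF values of a verb class pick out."""
    nodes = NODES[verb_class]
    return [" in ".join(nodes[i] for i in ref) for ref in REF_OPTIONS[verb_class]]


print(readings("change-of-state"))
print(readings("psych"))
```

Note that indices 1 (ACTOR) and 6 (INITIAL STATE) of the change-of-state frame never
appear among the REF values, which is what keeps agentive readings out of the sketch.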

Although Figures 4 and 5 show the range of values available for the reference of -ment
derivatives per verb class, they collapse all possible readings under REF. In other words,
REF = { [0], [1], [3], [4], [6] } and REF = { [0], [3], [4], [5], [7], [2]+[7] } state all possible
readings for -ment derivatives based on psych verbs and change-of-state verbs respectively,
but do not address the mechanisms by which these readings arise. In addition, these figures
establish no link between the readings that the two verb classes share. We will deal with
these issues in the following section.



    lexeme
    PHON    [x]-ment
    M-BASE  PHON  [x]
            SEM   [0]  psych causation
                       STIMULUS     [1]
                       EXPERIENCER  [2]
                       CAUSE        [3]  activity
                                         ACTOR      [1]
                                         UNDERGOER  [2]
                       EFFECT       [4]  change-of-psych-state
                                         INITIAL STATE  [5]  psych state
                                                             EXPERIENCER  [2]
                                         RESULT STATE   [6]  psych state
                                                             EXPERIENCER  [2]
    REF = [0], [1], [3], [4], [6]


Figure 5: -ment on psych verbs 


5 Accounting for polysemy 


There are two approaches to multiplicity of meaning in derivation: monosemy and pol- 
ysemy. We will first discuss the monosemy approach. 


5.1 A monosemy approach to multiplicity of meaning 


In the monosemy approach, multiplicity of meaning is reduced by assigning an under- 
specified meaning to an affix. More specific meanings of affixes derive from a general 
highly underspecified meaning. This is done by means of semantic extension rules and 
interaction between the semantics of the base and the affix. Concrete meanings of de- 
rived formations can also be attributed to contextual and encyclopedic information. 
The monosemy approach figures prominently in a number of works on deverbal for- 
mations. Consider for example the discussion of -er nominalizations (for Dutch see Booij 
1986 and for English Rappaport Hovav & Levin 1992, Plag 2003). A closer inspection of 
the analysis put forward by Plag (2003) illustrates the monosemy approach. According to 
Plag (2003: 89), -er derivatives often denote active or volitional participants in an event 
(e.g. singer, writer). Plag also mentions that -er is used to derive instrument nouns (e.g. 
blender, mixer), to denote entities associated with an activity (e.g. diner, toaster), and to 
derive person nouns indicating place of origin or residence (e.g. Londoner, New Yorker). 
The multiplicity of meaning evident in -er affixation leads Plag to propose that “the
semantics of -er should be described as rather underspecified, simply meaning something
like ‘person or thing having to do with X’. The more specific interpretations of individual
formations would then follow from an interaction of the meanings of base and suffix and 
further inferences on the basis of world knowledge.” (Plag 2003: 89) 

Let us now apply the monosemy approach to -ment derivatives. In order to do so we 
have to reduce multiplicity of meaning by identifying meanings that are shared by all 
-ment derivatives. The results in section 3.3 suggest that -ment forms denote (a) eventu- 
alities (see 4a), and (b) entities (see 4f). Thus, the abstract core meaning of -ment seems 
to be ‘eventuality or entity having to do with X’.

The disjunction 'eventuality or entity' illustrates the first problem that monosemy 
approaches are confronted with. In particular, the aim of monosemy approaches is to 
reduce multiplicity of meaning by postulating a unitary abstract meaning. But how ab- 
stract should this meaning be? In the case of -er, one could claim that -er derivatives 
denote ‘an entity having to do with X’. This qualifies as a unitary meaning since all -er 
derivatives do denote an entity. Derivatives in -ment, however, do not always denote 
an entity. They may be eventualities as well. Thus, we have to resort to the disjunc- 
tion 'eventuality or entity' to capture the semantics of -ment derivatives. This, however, 
shows that the desirable underspecified meaning cannot always be sensibly reduced to 
a single unitary meaning. 

The second problem with the monosemy approach is overgeneration. Let us assume 
that the semantics of -ment derivatives could be reduced to the underspecified meaning 
‘eventuality or entity having to do with X’. What kind of predictions would follow from 
this meaning with respect to (a) already attested meanings and (b) meanings that are 
excluded? Although the meaning 'eventuality or entity having to do with X' is abstract
enough to cover all attested readings of -ment derivatives, it leads one to expect that
-ment derivatives could in principle denote any kind of entity. This is not borne out by
the data, however, since agentive readings are never among the heterogeneous meanings of -ment.
Thus, we have to conclude that the monosemy approach fares poorly in predicting
which meanings are possible and which are not, simply because it
leads to massive overgeneration.


5.2 Polysemy in Frame Semantics 


In this section we propose that polysemy in derivation should be treated as multiplicity 
of meaning in word formation patterns. As we will show, given the architecture of frame 
semantics, this multiplicity of meaning can be expressed in an inheritance hierarchy of 
lexeme formation rules. 

Like some previous authors working on polysemy in word-formation (e.g. Desmets & 
Villoing 2009, Tribout 2010), we assume that attributes and their values are given in a 
type signature which can be considered as an ontology which covers world knowledge. 
According to Petersen & Gamerschlag (2014: 203-204), a type signature restricts the set 
of admissible frames, includes a hierarchy of the set of types, and states appropriateness 
conditions. These conditions declare the set of all admissible attributes for a lexeme of
a certain type and the values these attributes take. Appropriateness conditions are in- 
herited by subtypes (see also Riehemann 1998, Koenig 1999, Bonami & Crysmann 2016, 
Andreou & Petitjean 2017). Consider, for example, the type signature in Figure 6: 


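[Figure 6: type hierarchy rooted in ⊤. Its daughters are physical object (COLOR: color,
SHAPE: shape), taste (sour, sweet), color (red, green, blue), and shape (round, angular).
fruit (TASTE: taste) and dice (SHAPE: angular) are subtypes of physical object; apple
(SHAPE: round) is a subtype of fruit.]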

Figure 6: Example type signature (adapted from Petersen & Gamerschlag 2014: 204) 


In this type signature, subtypes are given below supertypes. For example, apple is a
fruit, which is itself a physical object. The node physical object carries two appropriateness
conditions (ACs), that is, it is characterized by the attributes COLOR and SHAPE, whose
values are of the types color (red, green, blue) and shape (round, angular), respectively.
According to the ACs on physical object,
TASTE does not attach to nodes of this type. Thus, not all physical objects have a taste.
Given that ACs are inherited and further specified by subtypes, apple inherits the ACs on
fruit and physical object. Thus, apple is characterized by the attributes TASTE, COLOR, and
SHAPE. The value of SHAPE is round since subtypes not only inherit attributes from their
supertypes, but also specify and further restrict the value of inherited attributes. In a
similar vein, dice inherits the attribute SHAPE from the node physical object and specifies
the value of SHAPE as angular.
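The inheritance of appropriateness conditions described above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the dictionary layout, the label `top` for the root type, and the function name `appropriateness_conditions` are hypothetical, and the sketch does not check that a refined value (e.g. round) is actually a subtype of the inherited value type (e.g. shape).

```python
from __future__ import annotations

# Toy encoding of the type signature in Figure 6 (illustrative only).
# Each type names its supertype and the appropriateness conditions (ACs)
# it introduces or restricts, as attribute -> value type.
TYPE_SIGNATURE: dict[str, dict] = {
    "top":             {"parent": None,              "acs": {}},
    "physical object": {"parent": "top",             "acs": {"COLOR": "color", "SHAPE": "shape"}},
    "fruit":           {"parent": "physical object", "acs": {"TASTE": "taste"}},
    "dice":            {"parent": "physical object", "acs": {"SHAPE": "angular"}},
    "apple":           {"parent": "fruit",           "acs": {"SHAPE": "round"}},
}


def appropriateness_conditions(type_name: str) -> dict[str, str]:
    """Collect the ACs of a type: inherited ones first, then local refinements."""
    entry = TYPE_SIGNATURE[type_name]
    parent = entry["parent"]
    inherited = {} if parent is None else appropriateness_conditions(parent)
    # Local ACs add new attributes or restrict the value of inherited ones.
    return {**inherited, **entry["acs"]}


if __name__ == "__main__":
    print(appropriateness_conditions("apple"))
    print("TASTE" in appropriateness_conditions("physical object"))
```

In this sketch, apple ends up with TASTE, COLOR, and SHAPE, with SHAPE restricted to round, while physical object itself has no TASTE attribute.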

The careful reader may have noticed that color in Figure 6 is used as an attribute label 
(i.e. COLOR) and as a type label (i.e. color). In frames, this redundancy is attributed to 
the ontological status of attribute concepts. These functional concepts can be interpreted 
both denotationally and relationally (Guarino 1992). Thus, the denotational interpretation 
of color covers the set of all colors (i.e. type label color) and the relational interpretation 
covers the use of color as a functional attribute that assigns a particular color (e.g. red) to 
the referent of the frame (for more on the use of functional attributes see Löbner 2015).

In the spirit of previous analyses (Riehemann 1998, Koenig 1999, Booij 2010, Bonami & 
Crysmann 2016) we assume that lexeme formation rules are also organized in an inher- 
itance hierarchy. In particular, consider the following inheritance hierarchy of lexeme 
formation rules (‘lfr’) for deverbal nouns (‘v-n’) derived by -ment. 

Figure 7 gives a partial hierarchy of the referential shifts attested in -ment affixation. It 
is only partial for two reasons. First, we do not model the use of -ment on adjectives (e.g. 
foolishment) and on nominal bases (e.g. illusionment). Second, due to space limitations 
we model only three possible readings of -ment derivatives, namely, event-nouns (evt-n), 
stimulus-nouns (stim-n), and result-state-nouns (r-st-n). The three dots on the right-hand
side show that there are other readings which we do not model here. 

The information on the left-hand side provides the phonology (PHON) of -ment derivatives.
That is, x-ment formations have the phonology [1]-/ment/, where the boxed [1] is
the phonology of the base (i.e. M-BASE). The possible readings are given on the right-hand
side of this figure under SEM (i.e. SEMANTICS).

In more detail, in event-nouns (evt-n), the event argument (EVT) of the morphological
base is identified with the referential argument (REF) of the derivative. This category
includes all -ment derivatives in which a transpositional reading is attested. As shown 
in Figure 7, the category of event nouns includes enrapturement and confoundment that 
are based on psych causation verbs, congealment and bedragglement that are based on 
change-of-state verbs, and addressment that is based on a verb of yet another class, illus- 
trate verbs. 

In the case of stimulus-nouns (stl-n), the reference of the noun is identified with the 
stimulus argument (STL) of the base. This category includes -ment derivatives based on 
psych causation verbs only (e.g. enrapturement, confoundment). -ment derivatives based 
on change-of-state verbs (e.g. congealment) are not included in this category since a 
stimulus argument is incompatible with change-of-state verbs (see the frame for change- 
of-state verbs in Figure 2). 

In the case of result-state-nouns (r-st-n), the reference of the noun is identified with the 
result state argument (RESULT STATE) of the morphological base. This category includes 
derivatives based on both psych causation verbs (e.g. confoundment) and change-of-state 
verbs (e.g. bedragglement). 
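The way a reading becomes unavailable when the base frame lacks the relevant argument can be sketched as follows. This is a deliberately reduced illustration, not the analysis itself: the frames are collapsed to bare sets of attributes, and the names `BASE_FRAMES`, `READINGS`, and `available_readings`, as well as the spelled-out attribute labels, are assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical, simplified base-verb frames: only the attributes each verb
# class makes available. The frames in the chapter's figures are richer.
BASE_FRAMES: dict[str, set[str]] = {
    "psych causation": {"EVENT", "STIMULUS", "RESULT STATE"},
    "change of state": {"EVENT", "RESULT STATE"},
}

# Each reading of -ment identifies the referent of the derived noun with
# one attribute of the base frame.
READINGS: dict[str, str] = {
    "event noun": "EVENT",
    "stimulus noun": "STIMULUS",
    "result-state noun": "RESULT STATE",
}


def available_readings(verb_class: str) -> set[str]:
    """Return the readings whose target attribute exists in the base frame."""
    frame = BASE_FRAMES[verb_class]
    return {reading for reading, attribute in READINGS.items() if attribute in frame}


if __name__ == "__main__":
    for verb_class in BASE_FRAMES:
        print(verb_class, sorted(available_readings(verb_class)))
```

Under these assumptions, the stimulus reading is simply not derivable for change-of-state bases, whereas the event and result-state readings are shared by both verb classes.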


6 Conclusion 


In this paper we have advocated a new approach to the formalization of polysemous 
derivational categories, based on frames as represented in attribute-value structures. The 
approach was illustrated with recent English neologisms derived with the suffix -ment, 
which we have shown to exhibit a wide range of possible readings. 

We have argued against an approach that assumes a highly underspecified meaning of 
-ment and in favor of an analysis that assumes hierarchically structured lexical rules and 
inheritance mechanisms. The proposed analysis has three main characteristics. First, it 
links the shared readings that are attested among the various verb classes. In the case of 
event-nouns, for example, we need not posit different rules per verb class since all -ment
derivatives that are based on change-of-state verbs and psych verbs can inherit the evt-n 
reading. Second, certain readings are excluded by means of appropriateness conditions 
that give rise to incompatibility. For instance, linking -ment derivatives that are based 
on change-of-state verbs to stimulus readings fails because the stimulus argument is 
incompatible with change-of-state verbs. Thus, inheritance fails. These characteristics 
allow us to deal with derivational polysemy without having to resort to underspecified 
meanings. Finally, the use of appropriateness conditions that give rise to incompatibility 
is an effective step to tackle overgeneration, which is a major problem for monosemy
approaches to meaning. 

As a next step in our research agenda, the approach will have to be applied to more 
verb classes that take -ment, and to other affixes. 
Chapter 19


Word formation in LFG-based layered 
morphology and two-level semantics 


Christoph Schwarze 


This article treats the problem of how the semantics of word formation can be accounted for 
in terms of rules and representations. A comprehensive model of multilayered, LFG-based 
morphology is proposed. It comprises four layers of representation: phonology, constituent 
structure, functional feature structure and lexical semantics. The meaning of derived words 
is treated in the framework of two-level semantics. It is assumed that rules of word formation
derive underspecified semantic forms, from which the actual meanings are construed
by recourse to conceptual structure. The model is illustrated on the basis of three morphological
processes: French é-prefixation, Italian denominal verbs of removal, and noun-to-
verb conversion in French. The analyses of é-prefixation and of verbs of removal are taken 
from the literature; the study on noun-to-verb conversion is original work. 


1 Introduction 


The hypothesis that the semantics of word formation is an aspect of grammar assumes 
that the processes of word formation concern both form and meaning. However, actual 
work on this basis encounters considerable challenges. The data available for the study 
of a given process of word formation never seem to show a perfect parallelism between 
form and meaning: forms that stem from a given generative process often have meanings 
on which it seems to be impossible to form a descriptive generalization. It is the aim of 
this paper to show how challenges to the semantics of word formation can be dealt with. 

I will first address the question of how morphological processes and structures can 
comprehensively be represented. I will then present three hypotheses concerning the 
semantics of word formation, namely 


i. The semantic output of morphological rules is underspecified.
ii. The meanings of the words that constitute the data arise from a sequence of
steps.
iii. These steps are: (a) the morphological rule defines an underspecified semantic
form; (b) semantic form is turned into a specified meaning on the basis of conceptual
knowledge; (c) the derived word enters the lexicon; and (d) the lexicalized
word may have its own development, independently from morphology,
so that its original meaning may be changed and its morphological origin
obscured.


Christoph Schwarze. Word formation in LFG-based layered morphology and two-
level semantics. In Olivier Bonami, Gilles Boyé, Georgette Dal, Hélène Giraudo &
Fiammetta Namer (eds.), The lexeme in descriptive and theoretical morphology, 487-


2 LFG-based Layered Morphology 


LFG-based Layered Morphology (LLM) integrates essential properties of Construction
Grammar Morphology,¹ which does justice to the multi-layered nature of the lexicon,
and HPSG-based morphology, which elaborates on the features that syntax receives from
morphology.²

Notice that LLM is a model, not a theory or a hypothesis. Unlike theories and hypothe- 
ses, which can be empirically evaluated with reference to observable data, models can 
only be evaluated with respect to their usefulness for the progression of knowledge. This 
kind of usefulness cannot be measured, it can only be shown by actual work on specific 
phenomena. That is what I will try to do in this study. 

Lexicalist models of grammar commonly assume that words are linguistic objects with
layered representations: phonological, syntactic and semantic. Accordingly, morphological
processes operate simultaneously at various layers or levels of representation.³ In
accordance with Lexical Functional Grammar (LFG) the LLM model makes a distinction
between the level of constituents, called the c-structure level, and a level of functional
features, called the f-structure level.⁴ The latter contains features concerning agreement,
tense, mood, inflectional class etc. It also contains grammatical functions and, importantly,
predicate features, which are labels of lexical meanings and encode grammatical
functions, the syntactic reflex of argument structure.

In addition to these two “syntactic” levels, morphological representations need to comprise
a phonological level to account for non-concatenative morphological processes,
like German Umlaut; cf. Germ. krank /kraŋk/ ‘sick’ + -lich /lɪç/ ‘ly’ → kränklich
/krɛŋklɪç/ ‘sickly’.

And, of course, there is a semantic level, where the lexical meanings encoded in the
lexicon are represented and processed. To sum up, morphological representations and
processes are located at


* The level of constituent structure (the c-structure level)
* The functional level (the f-structure level)
* The phonological level (the p-structure level)
* The semantic level (the s-structure level)


Unlike syntax, morphology may manipulate predicates, thus deriving new predicates.


¹ See Booij (2010), Booij & Audring (2017).

² For work on French, see Fradin (2005) and Tribout (2010b).

³ I fully agree with Aurnague & Plénat (2008: 1) when they say: “Une lexie est un n-tuplet de représentations
reliées entre elles, mais relevant chacune d'un niveau linguistique (phonologique, syntaxique, sémantique,
etc.) distinct. La description d'un mode de formation lexical productif suppose par conséquent que soient
relevées et expliquées les régularités apparaissant à chacun de ces niveaux.”

⁴ LLM was first presented in a seminar held by the author at the University of Padova in 2008 and subsequently
applied to the formation of Italian past and passive participles in Schwarze (2011).


3 A sample analysis: French é-prefixation 


I will illustrate LLM on the basis of Namer & Jacquey’s (2003)⁵ article on French
é-prefixation, which endeavours to give a formalized version of the findings of Aurnague &
Plénat (1997). In one of its modalities, é-prefixation turns nouns into transitive verbs that
denote events where the referent of the base noun is distanced, removed, or separated
from the referent of the direct object, as in (1):


(1) FR.
    a. é + branche → ébrancher (un arbre)
       ‘branch’      ‘to prune a tree’
    b. é + feuille → effeuiller (x)
       ‘leaf’        ‘to strip the leaves or petals from x’
    c. é + gorge → égorger (x)
       ‘throat’    ‘to cut x's throat’
    d. é + pou → épouiller (x)
       ‘louse’   ‘to delouse x’


Moreover, as has been shown by Aurnague & Plénat (1997, 2007, 2008), the relation 
that holds between the two dissociated entities must be “usual” and “natural”, or, as 
Namer & Jacquey put it: 


[D]escribing the process consisting in clearing a tree of e.g. the magpies (pie) or
the cats (chat) that colonize it cannot be performed by processes referred to by the
?épier⁸ or ?échatter impossible derived verbs. (Namer & Jacquey 2003: 2)


Table 1 gives the rule that generates verbs like ébrancher, effeuiller, égorger or épouiller 
in the LLM notation. 

The c-structure change as formulated in Table 1 should be self-explanatory, whereas 
a few comments on the f- and s-structure part of the rule will be useful. 


⁵ In a subsequent article, Namer & Jacquey (2012) proposed a model of the N>V vs. V>N derivations
within the framework of the Generative Lexicon.

⁶ Changes like adding /j/ to the stem as in épouiller are idiosyncratic and must be accounted for in the lexicon.

⁷ “[...] les dérivés en é- expriment la dissociation [...] par un agent intentionnel [...] d'une relation
d'attachement habituel [...] créée naturellement [...] et à laquelle il s'oppose [...]” (Aurnague & Plénat
2008: 28).

⁸ Not to be confused with existing épier qu. ‘to spy on someone’.
Table 1: The LLM rule for é-prefixation


c-structure   [é]_{V prefix} + N_{stem} → [éN_{stem}]_{V stem}
f-structure   (↑ PRED1) = ‘P’ → (↑ PRED2) = ‘DISSOCIATE ⟨(↑ SUBJ), (↑ OBJ)⟩’
p-structure   «no morphologically relevant change»
s-structure   λy p(y) → λx λy ∃z
              dissociate(x, y, z) ∧ natural_relationship(y, z) ∧ agent(x) ∧ theme(y)


PRED is a feature attribute, whose value identifies a word's lexical meaning and argument
structure. The input, (↑ PRED1) = ‘P’, contains a predicate variable, P, which ranges
over the nominal predicates associated with the constituent N_{stem}. The up-arrow is an
abbreviation for a function that projects the feature to the dominant c-structure node. The
output of the semantic change is a new predicate PRED2, which is defined by the rule. It
has two arguments, an agent and a theme, realized as the subject and the object respectively.
Notice that the prefix, in accordance with Namer & Jacquey (2003), has no direct
functional representation, because it has no referential meaning.⁹ As to the s-structure
level, the derived predicate, ‘dissociate’, has three semantic arguments: x, which is the
subject and refers to the agent, y, which is the object and refers to the theme, and z,
which is incorporated in the verb’s meaning and refers to the entity which is dissociated
from y. The additional predication, repeated as (2), is needed to constrain the range of
y and z:


(2) natural_relationship(y, z)


This part of the representation expresses the fact that the relation between y and z must 
not be a merely accidental one, as reported above. 

Notice that the change in s-structure as expressed in Table 1 does not predict the
full actual meanings of the verbs derived by é-prefixation: the derived representation
is underspecified.¹⁰ In the following section I will give some background for such an
assumption.
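The idea that a single rule operates on all four layers at once can be pictured with a small record type. The sketch below is only an illustration of the layered format of Table 1: the class `LayeredRepresentation`, the function `e_prefixation`, and the string encodings are hypothetical, and stem adjustments such as those in effeuiller or épouiller are deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayeredRepresentation:
    """One derived verb stem, described on the four LLM layers (as plain strings)."""
    c_structure: str
    f_structure: str
    p_structure: str
    s_structure: str
    base_noun_stem: str  # kept separately; the base noun constrains z at the conceptual level


def e_prefixation(noun_stem: str) -> LayeredRepresentation:
    """Apply a toy version of the rule in Table 1 to a noun stem."""
    return LayeredRepresentation(
        c_structure=f"[é{noun_stem}]V stem",
        f_structure="(↑ PRED) = 'DISSOCIATE <(↑ SUBJ), (↑ OBJ)>'",
        p_structure="no morphologically relevant change",
        # The semantic form is the same for every output of the rule: it is underspecified.
        s_structure=(
            "λx λy ∃z dissociate(x, y, z) ∧ natural_relationship(y, z) "
            "∧ agent(x) ∧ theme(y)"
        ),
        base_noun_stem=noun_stem,
    )


if __name__ == "__main__":
    for stem in ("branch", "gorg"):
        print(e_prefixation(stem))
```

Note that the s-structure string is identical for ébrancher and égorger in this sketch, which is exactly the sense in which the output of the rule is underspecified.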


⁹ “Our purpose is ... to represent the verb class obtained by the é-prefix derivation. To achieve this, two basic
ways are provided: (i) representing the prefix itself or (ii) representing an abstract, parametrized lexical
unit describing the output (verbal) class. The motivation for the first choice would be the fact that the
affix can be seen as some kind of predicate, operating on and controlling two arguments, the base and
the derived word, from a structural, categorial and above all semantic points of view. However, the nature
itself of the affix is a counterargument: according to the morphological theory defended here, an affix does
not belong to any of the major categories. In addition, we have seen that it bears no referential meaning:
consequently, it is not foreseeable to modelize its semantic content, as it has no proper semantic content”
(Namer & Jacquey 2003). I follow this argumentation, with the exception that not belonging to a major
category does not generally imply the lack of functional or semantic information.

¹⁰ This assumption is quite common in the literature, see the survey in Tribout (2010b: 282-284).
4 Two-level semantics


If we assume that lexical morphology is a generative subsystem that feeds the lexicon,
then its semantics is part of lexical semantics. Now, the fundamental question is to what
extent lexical semantics is a matter of grammar. According to the conception known as
two-level semantics, lexical meaning is represented at two distinct levels, semantic form
(SF) and conceptual structure (CS).¹¹

Semantic form is linguistic knowledge. SFs “are systematically connected to, and hence
covered by, lexical items and their combinatorial potential to form more complex
expressions” (Lang & Maienborn 2011: 711). They “form an integral part of the information
cluster represented by the lexical entries of a given language” (Lang & Maienborn 2011:
711). They are “accessibly stored in long-term memory” (ibid.). They are underspecified
with respect to CS representations (Lang & Maienborn 2011: 713). And, importantly, SF
is the level at which two-level semantics endeavors to represent the compositionality of
lexical meaning and the grammatical role of lexical decomposition (Lang & Maienborn
2011: 723).

To return to the semantics of word formation: it is an aspect of grammar as far as
semantic form is concerned. Most of the characteristics of SF that hold for ordinary lexical
semantics also hold for the semantics of word formation, with one exception: compositionality
is not a general feature of lexical morphology. In fact, non-concatenative processes
may be absolutely regular, but cannot be compositional, since compositionality
presupposes concatenation.

In order to see whether the semantics of lexical morphology can reach out to phenomena
that are situated beyond SF, let us see what two-level semantics means by conceptual
structure.

Conceptual structure can be said to be world knowledge (Lang & Maienborn 2011:
711). That does not mean, however, that it has nothing to do with language; in fact, it
is closely related to SF: CS representations are built upon and enrich SF representations.
Thus, semantic representations typically contain both CS and SF features. This happens in
such a way that, for the representation of a given lexical meaning, the CS features specify
and enrich SF representations, thus enabling words to denote their referents.¹² ¹³ ¹⁴


¹¹ For a critical state-of-the-art overview, see Lang & Maienborn (2011).

¹² In Lang and Maienborn's words: “...for every linguistic expression e in language L there is a CS representation
c assignable to it via SF(e), but not vice versa” (Lang & Maienborn 2011: 711); “... CS representations are
taken to belong to, or at least to be rooted in, the non-linguistic mental systems based on which linguistic
expressions are interpreted and related to their denotations.”

¹³ This conception has an important consequence: if the features retrieved from CS are combined with or
replace SF features, doing lexical semantics does not mean representing the entire bulk of knowledge and
beliefs that we have about the referents of the lexemes under investigation.

¹⁴ As to the mental status and processing of CS representations, they are assumed to be “activated and compiled
in working memory”, contrary to SF representations, which, as has been said above, are stored in
long-term memory (Lang & Maienborn 2011: 712). I am not sure about the mental status of CS: it may safely
be assumed that concepts, once they are lexicalized as meaning components, are as stable as SFs.
5 A second sample analysis: Italian denominal verbs of
removal


It will be useful to illustrate underspecification and its resolution with an example from
derivational morphology. I will briefly present the analysis of the denominal verbs derived
by s-prefixation in Italian as proposed by von Heusinger & Schwarze (2006).¹⁵

The morphological process generates verb stems from noun stems by prefixing the
constituent s- to the noun stem;¹⁶ cf. (3) and (4):


(3) a. crem(a)_N → screma(re)_V
       ‘cream’     ‘to skim’
    b. carcer(e)_N → scarcera(re)_V
       ‘prison’      ‘to release from prison’


(4) a. La mattina, la nonna scremava il latte.
       the morning the grandmother skimmed.IMPERFECT the milk
       ‘In the morning, Grandmother used to skim the milk.’
    b. Il giudice ha scarcerato Giovanni Rossi.
       the judge has released-from-prison Giovanni Rossi
       ‘The judge released Giovanni Rossi from prison.’


Both verbs, scremare and scarcerare, mean 'x removes y from z'. However, they differ
with respect to the role of the nominal base in the verbs’ meaning. In terms of Leonard
Talmy's (1985) lexical typology of motion events, the entity denoted by the base noun
may be the Figure or the Ground. In (4a) the cream (crema) is the Figure; it is removed
from the milk (latte), which is the Ground. Conversely, in (4b) the prison (carcere) is the
Ground, from which Giovanni Rossi, the Figure, is released. Thus the speaker needs to
decide on the assignment of Figure and Ground for every single verb generated by N→V
s-prefixation. In a two-level semantics, SF will only state that the verbs under discussion
denote caused motion, the role of the incorporated noun being left open. The general
semantic form of these verbs may thus be written as (5):¹⁷


(5) cause(x, become(¬located(y, z))) & [N(y) ∨ N(z)]


The first part of representation (5), cause(x, become(¬located(y, z))), is the lexical
decomposition of the main feature of all verbs of removal, remove(x, y, z). The second part,


¹⁵ Giuseppina Todaro (2017) applies the von Heusinger & Schwarze (2006) approach to prefixed deadjectival
verbs in Italian.

¹⁶ Notice that Italian also has a V→V s-prefixation, which derives verbs of reversal, see Mayo et al. (1995:
932), among others. This is a different morphological process, which I do not discuss here.

¹⁷ In von Heusinger & Schwarze (2006) the representation given here as (5) is not the final version, which uses
indices in order to account for the correlation between ambiguity of role assignment and the alternative of
quantification. In fact, if the predicate of the base noun is incorporated in the verb, it is only existentially
bound by ∃. If it becomes the direct object, it is bound by the λ operator. In (5), quantification is omitted
for the sake of easier reading.
N(y) ∨ N(z), expresses the underspecification of s-prefixed verbs of removal by a disjunction,
where N is the predicate of the base noun.

The ambiguity expressed by this disjunction is resolved at the CS level. According to
von Heusinger & Schwarze's (2006) analysis, the resolution of the underspecification
passes through the following phases: the concepts associated with the base noun predicates
are looked up in CS and checked regarding their aptitude to be a Figure or a Ground
in a motion event. Objects that may contain something are apt to take the role of Ground;
objects that may easily perform or undergo motion are apt to be the Figure. Some objects,
such as a sheet of paper, may meet both criteria and may consequently motivate derived
verbs with two alternative fully specified meanings. Italian scartare, derived from carta
‘paper’, is such a case: it may be used as either a Ground verb or a Figure verb; cf. (6):


(6) Mario scarta il regalo.
    Mario s-paper.3SG.IND.PRES the gift
    ‘Mario takes the gift out of the paper.’ (Ground verb)
    ‘Mario takes the paper off the gift.’ (Figure verb)
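The resolution procedure just described, which looks up the base noun's concept and tests its aptitude for the Figure and Ground roles, can be sketched in a few lines. The sketch is a toy under stated assumptions: the table `CONCEPTS`, its two boolean properties, and the function `resolve_roles` are hypothetical stand-ins for conceptual structure, not part of the cited analysis.

```python
from __future__ import annotations

# Hypothetical conceptual-structure (CS) entries for three Italian base nouns.
# The two properties mirror the criteria in the text: containers are apt to be
# Grounds, easily movable objects are apt to be Figures.
CONCEPTS: dict[str, dict[str, bool]] = {
    "crema":   {"can_contain": False, "moves_easily": True},   # 'cream'
    "carcere": {"can_contain": True,  "moves_easily": False},  # 'prison'
    "carta":   {"can_contain": True,  "moves_easily": True},   # 'paper'
}


def resolve_roles(base_noun: str) -> set[str]:
    """Resolve the disjunction N(y) ∨ N(z) to the roles the base noun can fill."""
    concept = CONCEPTS[base_noun]
    roles: set[str] = set()
    if concept["moves_easily"]:
        roles.add("Figure")   # N(y): the base noun denotes what is removed
    if concept["can_contain"]:
        roles.add("Ground")   # N(z): the base noun denotes what it is removed from
    return roles


if __name__ == "__main__":
    for noun in CONCEPTS:
        print(noun, sorted(resolve_roles(noun)))
```

With these toy entries, crema resolves to Figure only, carcere to Ground only, and carta to both, which corresponds to the two readings of scartare in (6).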


Table 2 gives the rule that derives denominal s-prefixed verbs of removal in the LLM
format, with the semantic layer formulated in such a way as to generate underspecified
SF representations.¹⁸ ¹⁹


Table 2: The rule for deriving Italian denominal s-prefixed verbs 


c-structure   [s]_{V prefix} + N_{stem} → [sN_{stem}]_{V stem}
f-structure   (↑ PRED1) = ‘P’ → (↑ PRED2) = ‘REMOVE ⟨(↑ SUBJ), (↑ OBJ), (↑ OBL)⟩’
p-structure   «no morphologically relevant change»
s-structure   z(y) → cause(x, become(¬located(y, z))) & [N(y) ∨ N(z)]


6 A third sample analysis: French N→V conversion


I will now present a case study of French N→V conversion, as exemplified by the pairs
in (7):

(7) a. amidon - amidonner
       ‘starch’ ‘to starch’
    b. archives - archiver
       ‘archives’ ‘to archive’
    c. bêche - bêcher
       ‘spade’ ‘to dig’
    d. sucre - sucrer
       ‘sugar’ ‘to sugar’


¹⁸ For easier reading, I do not express here the case-marking of the Oblique, which must be ne if its predicate
is ‘pro’ and must be marked by preposition da elsewhere.

¹⁹ The [s] vs. [z] realization of the prefix is a matter of post-lexical phonology, hence it is not expressed in
the morphological rule.


The relation between the nouns and the respective verbs in (7) is clearly directed, which
does not hold for other noun-verb pairs, such as those given in (8):


(8) a. chant - chanter
       ‘song’ ‘to sing’
    b. gel - geler
       ‘frost’ ‘to freeze’
    c. prêt - prêter
       ‘loan’ ‘to lend’
    d. vent - venter
       ‘wind’ ‘to be windy’


The difference between (7) and (8) is due to the ontological class of the nouns’ predicates:
whereas the nouns in (7) denote objects or substances and thus are clearly distinct
from the respective verbs, those given in (8) denote events or results of events and thus
are not clearly distinct from the verbs they relate to. The derivational direction in (7)
clearly is N→V, because event predicates may be built upon object or substance predicates,
but not vice versa.²⁰ By contrast, the direction of conversion in (8) may be the opposite,
V→N,²¹ or non-directional, N↔V, because the nouns’ and the verbs’ predicates are identical
or very closely related.

As for the semantics of N→V conversion, I assume that the rule defines an underspecified
semantic form, from which full meanings are derived by a retrieval of conceptual
structure.²² To account for actual meanings that are not predicted on this basis,
post-morphological processes are taken into account. It is also assumed that there are certain
verbs that look like N→V converts, but are idiosyncratic items not derived by the rule.


²⁰ Cf. the more explicit formulation by Tribout (2010b: 140): “... le recours aux propriétés sémantiques des deux
lexèmes pour déterminer l'orientation de la conversion repose, par exemple, sur l'idée que le lexème dérivé
est nécessairement défini par le biais de son lexème base, tandis que le lexème base est sémantiquement
indépendant de son lexème dérivé. Ainsi pour la paire CLOU~CLOUER, CLOUER est nécessairement défini
relativement à CLOU comme ‘faire quelque chose avec des clous’ tandis que CLOU est défini comme un
petit objet pointu, indépendamment de clouer. Cette asymétrie dans la relation sémantique entre les deux
lexèmes permet de prédire une orientation de la conversion de nom à verbe.”

²¹ For a state-of-the-art discussion on the direction of the French N→V vs. V→N conversion see Tribout
(2010a: 348-356).

²² Tribout (2010b: 284-290) criticizes the underspecification approach; instead she proposes and spells out
a fully specified semantics, based upon a classification of the output verbs. I am trying to show that an
underspecification-based analysis of the French N→V conversion is an achievable goal.




6.1 A database 


As a descriptive basis for the study, I established a database of 170 verbs that clearly
are N→V converts. Nineteen of these verbs are prefixed and have no lexicalized unprefixed
counterpart, such as emprisonner ‘to imprison’.

I consider including prefixed verbs of this kind as legitimate, because the prefixes in- 
volved, en-, dé- and re-, require a verbal base. Emprisonner, e.g., thus has the derivational 
history shown by (9): 


(9) prison_N → prisonner_V → emprisonner_V


In addition to the verbs and their base nouns, the database contains the following 
information: 


* The verb's underspecified semantic form (SF), if there is one 
* The specified semantic representation (SR) 
* The conceptual class of the base noun 
* The presence of a prefix, if there is one 
* Remarks on formal and semantic properties of the derived verbs 
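
The record structure listed above can be made concrete with a small sketch. The following is only an illustration under assumptions: the field names, the glosses, and the SF label given to emprisonner are hypothetical and not taken from the actual database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConvertEntry:
    """One database record for an N-to-V convert (field names are hypothetical)."""
    verb: str
    base_noun: str
    sf: Optional[str]        # label of the underspecified semantic form, None if none was identified
    sr: str                  # specified semantic representation, here reduced to a gloss
    noun_class: str          # conceptual class of the base noun
    prefix: Optional[str] = None
    remarks: str = ""


# Illustrative records only; the SF label given to emprisonner is an assumption.
ENTRIES = [
    ConvertEntry("huiler", "huile", "SF1", "to lubricate with oil", "substance"),
    ConvertEntry("emprisonner", "prison", "SF1", "to imprison", "container", prefix="en-"),
    ConvertEntry("fronder", "fronde", None, "to satirize", "instrument"),
]


def with_sf(entries):
    """Return the records for which an underspecified semantic form exists."""
    return [entry for entry in entries if entry.sf is not None]


def prefixed(entries):
    """Return the records whose verb carries a prefix."""
    return [entry for entry in entries if entry.prefix is not None]
```

With records of this shape, counts such as the number of verbs per semantic form or per conceptual class reduce to simple filters over the list.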


6.2 The underspecified semantic forms 


Underspecified semantic forms could be construed for 142 of the 170 verbs. The predom- 
inant one, which holds for 136 of the 170 verbs contained in the database, states the 
following: 


* The verb denotes an event, which is an action 
* It involves an agent and a theme 
* The denotation of the noun from which the verb is derived is a salient component of the action 

For an illustration, see example (10): 


(10) Le secrétaire a archivé la correspondance. 
the secretary has archived the correspondence 


‘The secretary archived the correspondence.’ 


The SF underlying (10) states that the sentence describes an action. The denotation of 
the noun archives is a salient component of that action. The verb, archivé, has two arguments, 
le secrétaire and la correspondance, whose roles are agent and theme respectively. 

In addition to the predominant SF, two more SFs have been identified; they are closely 
related to the predominant one, see examples (11) and (12). (11) describes an action, but 
unlike (10), the verb has no argument in the role of theme. (12), where the reflexive 
pronoun is the operator of the middle voice, describes a process; the verb’s only argument 
is in the role of theme. 


(11) Nous avons passé l’après-midi à magasiner. (Canadian Fr.) 
we have spent the afternoon to window-shopping 


‘We spent the afternoon window-shopping.’ 


(12) Leurs genoux se sont ankylosés. 
their knees REFLEXIVE_PRONOUN are ankylosed 


‘Their knees have become stiff.’ 


All SFs assumed for the verbs contained in the database are shown in Table 3, which 
also shows the forms of the semantic predicates involved, the mapping of the arguments 
onto grammatical functions and the number of verbs for each SF.²³ 


Table 3: Underspecified semantic forms of converted denominal verbs 


      Semantic predicate    Grammatical functions    SF                                     Verbs
SF1   P(e, x, y)            x → SUBJ, y → OBJ         x accomplishes an action on y, N is    136
      agent(x), theme(y)                             salient in that action.
SF2   P(e, x)               x → SUBJ                  x accomplishes an action. N is         4
      agent(x)                                       salient in that action.
SF3   P(e, x)               x → SUBJ                  x undergoes a process. N is salient    4
      theme(x)                                       in that process.


We can now formulate the rule for French N→V conversion, see Table 4.²⁴ At the 
semantic layer, only the predominant SF is given. 


23 There are two questions that I cannot address here in detail. First, how productive is the process analyzed 
here? French is a language that overwhelmingly prefers affixation to conversion. I assume that N→V conversion is 
fully productive, but that much of its output is blocked by the output of competing rules of affixation. 
Second, can the non-dominant SFs be derived from the predominant one? Further research is needed here. 

24 Except the selection of alternative lexicalized stem variants, see fn. 8. In the table I omit quantification 
again in order to make reading easier. 
Table 4: The layered rule for N→V conversion 


c-structure   N_stem → V_stem, 1st inflectional class 

f-structure   (↑ PRED) = ‘P1’ → (↑ PRED) = ‘P2 ⟨(↑ SUBJ), (↑ OBJ)⟩’ 

p-structure   «no morphologically relevant change» 

s-structure   p1(z) → p2(e, x, y) ∧ agent(x) ∧ theme(y) ∧ 
              salient_component_of(e) = p1(z) 


7 Resolving the underspecified semantic forms 


As has already been pointed out, the underspecified semantic forms cannot be used in 
discourse, because they are unable to refer to the specific actions denoted by the verbs. 
Hence the underspecification needs to be resolved. This happens by accessing the conceptual 
knowledge associated with the base nouns. Regarding N→V conversion, I assume 
that the speaker or hearer looks up the conceptual knowledge associated with the noun, 
inspects the event types in which the noun's denotation is typically involved, and finally 
creates a new semantic predicate in which one of these event types is, so to speak, incorporated. 
The noun's meaning is then turned into a feature of the new predicate, a feature 
that becomes visible by lexical decomposition. I will try to illustrate this idea by means 
of two examples. The first is (13): 


(13) L’orfèvre a ciselé leurs noms sur les alliances. 
the goldsmith has chiseled their names on the wedding rings 


‘The goldsmith engraved their names on the wedding rings.’ 


The verb contained in (13) has the general, underspecified semantic form listed as SF1 
in Table 3, and repeated here as (14): 


(14) X accomplishes an action on y; N is salient in that action. 


For ciseler ‘to chisel, to engrave’ we replace N with “a chisel”, getting (15): 


(15) X accomplishes an action on y; a chisel (Fr. ciseau) is salient in that action. 


The conceptual knowledge associated with ciseau contains, among others, the infor- 
mation given under (16): 


(16) A chisel is a tool, used for cutting wood, stone or metals. 


The predicate cut(x, y) is the semantic counterpart of the concept of cutting. Going 
back from conceptual structure to semantic form, the speaker inserts it into the 
decomposed semantic representation of the new predicate created by the conversion 
rule. The meaning of the new predicate also contains chisel(x), taken from the base 
noun. Since, according to (16), a chisel is a tool, i.e. an instrument, the feature will be 
instrument_used(e, x, y) = chisel(z). Notice that z is not an argument of the new predicate 
and will not be realized in the sentence. (17) is the assumed semantic representation 
of ciseler, after the resolution of underspecification. 


(17) ∃e λx λy chisel(e, x, y)²⁵ 
     event_type(e) = action(e) 
     action_type(e) = cut(x, y) 
     agent(e) = x 
     theme(e) = y 
     instrument_used(e) = chisel(z) 


The first line of (17) gives the semantic representation of ciseler in the standard no- 
tation. The remaining lines give its decomposed meaning in terms of features, written 
as equations, in the tradition of unification grammars. (This notation mainly shows its 
usefulness when larger sections of the lexicon are analyzed: it makes it easy to express 
feature inheritance, and it helps to control the consistency of the features declared.) 

The second example I give for the resolution of underspecification is (18): 


(18) Les chasseurs ont huilé leurs fusils. 
the hunters have oiled their shotguns 


‘The hunters oiled their shotguns.’ 


The verb huiler ‘to oil’ has the same SF as ciseler. Applied to the base noun huile ‘oil’ 
it reads: 


(19) X accomplishes an action on y, oil (Fr. huile) is salient in that action. 


Accessing the conceptual knowledge associated with huile, the speaker gets, among oth- 
ers, the information given under (20): 


(20) Oil is a substance used to lubricate a mechanism. 


The predicate lubricate(x, y) is the semantic counterpart of the concept of lubricating. 
The speaker inserts it into the decomposed semantic representation; the meaning of the 
new predicate also contains oil(x), taken from the base noun. Since, according to (20), 
oil is a substance, the feature will be substance used(e, x, y) = oil(z). (21) is the assumed 
semantic representation of huiler: 


(21) ∃e λx λy oil(e, x, y) 
     event_type(e) = action(e) 
     action_type(e) = lubricate(x, y) 
     agent(e) = x 
     theme(e) = y 
     substance_used(e) = oil(z) 
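

The two resolutions in (17) and (21) follow the same steps, which can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not a claim about how speakers actually process conceptual structure: the dictionary CONCEPTS is a hand-written, hypothetical stand-in for the conceptual knowledge in (16) and (20), and the function name resolve_sf1 is invented for the illustration.

```python
# Hypothetical, hand-written conceptual knowledge based on (16) and (20).
CONCEPTS = {
    "chisel": {"category": "instrument", "typical_event": "cut"},
    "oil": {"category": "substance", "typical_event": "lubricate"},
}


def resolve_sf1(noun: str) -> dict:
    """Resolve SF1 ('x accomplishes an action on y, N is salient') for one base noun.

    The result mirrors the feature equations of (17) and (21) as a flat dictionary.
    """
    concept = CONCEPTS[noun]
    return {
        "predicate": f"{noun}(e, x, y)",
        "event_type": "action",
        # The event type found in the conceptual knowledge is incorporated here.
        "action_type": f"{concept['typical_event']}(x, y)",
        "agent": "x",
        "theme": "y",
        # The feature name depends on the conceptual category of the noun;
        # z is not an argument of the new predicate.
        f"{concept['category']}_used": f"{noun}(z)",
    }
```

Calling resolve_sf1("chisel") yields a feature structure with instrument_used set to chisel(z), while resolve_sf1("oil") yields substance_used set to oil(z), matching the last lines of (17) and (21).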


25 For readers not familiar with the French language, I use English to name semantic features, even though 
this may make the analysis somewhat inaccurate. 
7.1 Polysemy in lexical morphology 


The conceptual categorization of ‘oil’ I assumed for the above sample analysis, i.e. that 
‘oil’ is a substance used to lubricate a mechanism, is far from being the only one.²⁶ As 
we know, oil is also used to preserve wood or iron and to cook and season food; it is also 
a fuel and an ingredient of oil paint. As linguists, we do not have scientific methods to 
find out to what extent knowledge of this kind is contained in the conceptual structure, 
and we have no precise knowledge of how conceptual structure is processed during the 
resolution of semantic underspecification. However, we can look at the lexicon and see 
those elements of conceptual structure that show up in the lexical meanings of a given 
language. Thus we can observe that, in the meaning variation of the French verb huiler 
‘to oil’, the following bits of information clearly play a role: 


i. Oil is a lubricant (22), repeated from (18). 
ii. Oil is a preservative (23). 


(22) Les chasseurs ont huilé leurs fusils. 
the hunters have oiled their shotguns 


‘The hunters oiled their shotguns’ 


(23) Cette table a besoin d’être huilée. 
this table has need to be oiled 


‘This table needs to be oiled.’ 


As to using oil for preparing or seasoning food, the situation is less clear. According 
to the reviewer of this article, whom I believe to be a native speaker of French, huiler 
cannot mean ‘to season with oil’. I briefly searched the Internet and found out that there 
were zero hits for huiler la viande (viande means ‘meat’) and huiler les steaks. There were 
several hits for huiler la salade, but only two of them were from real text (24) and (25), 
the others being citations from dictionaries. 


(24) J'aime faire des vinaigrettes qui ne font pas qu'assaisonner ou huiler la salade 
mais qui apportent plutôt une valeur ajoutée.²⁷ 


‘I like to make vinaigrettes that do not only season or oil the salad but rather 
bring an additional value.’ 


(25) Ne pas huiler la salade, car ainsi suivant son goût chacun fera sa propre 
vinaigrette, et puis s'il reste de la salade, elle se conservera plus facilement sans 
vinaigrette.²⁸ 


‘Don’t oil the salad, because that’s how everyone will make their own vinaigrette 
to their taste, and then, if some salad is left over, it will be preserved more easily 
without vinaigrette.’ 


26 I inserted this section as a response to a comment I received from an anonymous reviewer. For the analysis 
of polysemy in lexical morphology, also see Schwarze (2012). 

27 http://brutalimentation.ca/2017/01/14/salade-festive-vinaigrette-digestive [2017-08-29]. 

28 http://ilovecuisine.blogspot.ch/2013/09/ma-salade-de-lete-la-salade-nicoise.html [2017-08-29]. 
The remaining known uses of oil do not seem to play a role in the meaning variation 
of French huiler. Instead of speculating about why this should be so, let us pass on to a 
question that immediately arises from what we could observe. 

Assuming that the accessible conceptual structure offers competing information for 
the resolution of the underspecified meaning generated by the morphological process, 
the full meaning of huiler shows the following variants: 


(26) a. ‘To lubricate with oil’ 
     b. ‘To preserve with oil’ 
     c. ‘To prepare or season with oil’ 


The question now is: How do speakers pick out the appropriate reading in producing or 
parsing utterances? This is a very general question, not specific to the semantics of word 
formation. In the case of transitive verbs such as huiler, a sort of semantic agreement is 
at work, which checks the compatibility of the verb’s reading with the conceptual class 
of the direct object. 
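
The compatibility check described here can be sketched as a simple lookup. The sketch below is hedged throughout: the class labels, the assignment of nouns to classes, and the function name compatible_readings are hypothetical and serve only to illustrate the idea of semantic agreement between the readings in (26) and the direct object.

```python
# Readings of huiler from (26), each paired with the conceptual classes of
# direct objects it is compatible with (class labels are hypothetical).
READINGS = {
    "to lubricate with oil": {"mechanism"},
    "to preserve with oil": {"wooden_object", "iron_object"},
    "to prepare or season with oil": {"food"},
}

# Hypothetical conceptual classes for the direct objects in (22), (23), (24).
OBJECT_CLASS = {
    "fusil": "mechanism",
    "table": "wooden_object",
    "salade": "food",
}


def compatible_readings(direct_object: str) -> list:
    """Return the readings whose object classes include the class of the object."""
    object_class = OBJECT_CLASS.get(direct_object)
    return [
        reading
        for reading, classes in READINGS.items()
        if object_class in classes
    ]
```

For fusil the function returns only the lubricating reading, and for table only the preserving one; an object whose class is unknown returns an empty list, which is one way to model the absence of a licensed reading.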

Regarding the avoidance of huiler with a direct object denoting meat, there may be 
practical reasons or no reason at all; there are phenomena in verbal behavior that are 
beyond the reach of linguistic analysis. 


8 Restrictions on the input 


It can easily be seen that many nouns are not fit to be a base in the French N→V conversion. 
In a list of the first 100 non-eventual nouns contained in the Petit Larousse, only 
two are a base of N→V converts, and only one of them, acier ‘steel’, is the stem of a verb 
with a transparent meaning, aciérer ‘to cover with steel’.²⁹ Notice, however, that this 
finding rests on a very weak empirical basis. The nouns considered are very few, and 
the data are limited to strongly lexicalized items. More research is needed to get reliable 
quantitative results. So I will just characterize the database with respect to the 143 nouns 
that are the base of verbs with a transparent meaning. Turning these observations into 
well-founded constraints and disentangling grammatical constraints on the input and 
conditions for use and lexicalization of the output must be left to further research. 

The following semantic characteristics of the base nouns can be gathered from the 
database: 


* Most base nouns denote an instrument (42 items),³⁰ a substance (36 items), a container 
(seven items), or a body part (nine items); see Tables 8 to 11 in the Appendix. 


29 The other, abime ‘abyss’, has abimer ‘to damage’ as a convert, but that verb has a meaning that does not 
seem to be derived in a straightforward way from the noun's meaning. 

30 Cf. “Les verbes converts instrumentaux sont parmi les plus nombreux. Ils sont mentionnés dans toutes les 
études portant sur la conversion et sont généralement définis comme signifiant ‘utiliser N’, selon le schéma 
... X utiliser Nb” (Tribout 2010b: 263). 
* Only one noun, enfant ‘child’, denotes a human being. The derived verb, enfanter 
‘to give birth to’, is infrequent and strongly marked as belonging to the literary 
register. 

* Only two nouns denote an animal, raton ‘young rat’ and zèbre ‘zebra’. Ratonner 
‘to commit a racial attack (ratonnade) on North-African immigrants’ has no transparent 
semantic relationship to its base. As to zèbre, the derived verb, zébrer ‘to 
stripe’, is only weakly transparent: rather than to the animal, it refers to a visual 
pattern, black stripes upon a white ground. 


Regarding the formal properties of the base nouns, short words are preferred: most 
of them are mono- or disyllabic, only three (ankylose ‘ankylosis’, courbature ‘ache, stiffness’, 
and magasin ‘store’) have three and only one (photographie ‘photography’) has 
four syllables. 

Nouns consisting of one morpheme only are clearly preferred; only tambourin ‘tambourine’ 
and photographie ‘photography’ may be segmented into morphemes. There are 
no agent nouns in -(at)eur and no quality nouns in -(i)té in the stems of derived verbs. 


9 Reduced or lacking transparency - construed lexemes in time 


The database contains several verbs whose relationship with the base noun is not fully 
transparent or not transparent at all. For no fewer than 25 of the 170 verbs, no underspecified 
semantic form could be identified, which means that the meaning of the base 
noun is not a feature of the derived verb, see the examples in (27): 


(27) a. fourrager fourrage 
‘to rummage through’ ‘forage’ 
b. fronder fronde 
‘to satirize' ‘slingshot; revolt’ 
c. gueuler gueule 
‘to yell, to bawl’ ‘mouth’ 


Ten verbs can be analyzed as having undergone some post-morphological change 
along one of the familiar paths of semantic change or variation, such as narrowing or 
widening an original meaning. Examples are shown in Table 5: 

Table 5: Post-morphological change along some familiar paths 


N        English     V          English                                     Kind of change
fer      ‘iron’      ferrer     ‘to shoe (a horse)’, ‘to strike (a fish)’   narrowing
jardin   ‘garden’    jardiner   ‘to do some gardening’                      narrowing
mur      ‘wall’      emmurer    ‘to wall (a prisoner)’                      narrowing
ombre    ‘shadow’    ombrer     ‘to shade, to hatch’                        narrowing
peau     ‘skin’      peler      ‘to peel’                                   narrowing
piste    ‘trace’     dépister   ‘to track down (a game)’                    narrowing
plume    ‘feather’   plumer     ‘to pluck (a bird)’                         narrowing
tapis    ‘carpet’    tapisser   ‘to decorate (a wall and similar)’          widening


A particular kind of incomplete semantic transparency of the converted verb is due 
to the fact that, rather than the verb, the base noun underwent a change after the derived 
verb entered the mental lexicon. Examples are échafauder ‘to put up scaffolding’ and 
mitrailler ‘to machine-gun’. The base noun of échafauder, échafaud, no longer means 
‘scaffolding’; in modern-day French it means ‘executioner’s platform’. The verb’s 
meaning came about when échafaud still meant ‘scaffolding’. Likewise, mitrailler ‘to 
machine-gun’ was created when the noun, mitraille, still meant ‘machine gun’. Its meaning 
changed to ‘hail of bullets’, which lessened the semantic transparency of the derived 
verb. 

The formal transparency may also be obscured, i.e. the noun’s stem may differ to 
some extent from the derived verb’s stem.³¹ The variation in such cases is mostly due to 
morphologization of a phonological variation existing at an earlier stage of the language 
and may be made less opaque by the existence, in modern French, of other examples 
that exhibit the same lexical variation. The variation between /o/ and /ɛl/ or /əl/ as in 
peau /po/ ‘skin’ - /pɛl/ ‘peels’ and peler /pəle/ ‘to peel’ is such a case. Its transparency is 
improved by the presence of numerous items like those given in (28): 


(28) a. nouveau /nuvo/ ‘new.MAS’ 
     b. nouvelle /nuvɛl/ ‘new.FEM’ 
     c. renouveler /rənuvəle/ ‘to renew’ 
     d. niveau /nivo/ ‘level’ – niveler /nivəle/ ‘to level’ 


31 For a complete list of the kinds of allomorphy involved in N→V conversion see Tribout (2010b: 114f). She 
argues that even totally opaque pairs such as pierre ‘stone’ and lapider ‘to stone’ may be analyzed as cases 
of conversion, because they are related by suppletion (Tribout 2010b: 110, 118). 
But this is not always the case. See the right-most column in Table 6. 


Table 6: Stem variation in N-V pairs 


N                 English     V                    English            Remarks
ciseau /sizo/     ‘chisel’    ciseler /sizəle/     ‘to chisel’        see (28)
faux /fo/         ‘scythe’    faucher /foʃe/       ‘to scythe’        isolated variation
grain /gʁɛ̃/       ‘grain’     engrener /ɑ̃gʁəne/    ‘to engage with’   no transparency
hiver /ivɛʁ/      ‘winter’    hiverner /ivɛʁne/    ‘to winter’        cf. jour – journée
marteau /maʁto/   ‘hammer’    marteler /maʁtəle/   ‘to hammer’        see (28)
nœud /nø/         ‘knot’      nouer /nue/          ‘to knot’          cf. jeu – jouer
poil /pwal/       ‘hair’      peler /pəle/         ‘to peel’          cf. moi – me
poil /pwal/       ‘hair’      épiler /epile/       ‘to depilate’      native vs. borrowed
sang /sɑ̃/         ‘blood’     saigner /seɲe/       ‘to bleed’         isolated variation


Most of these cases of reduced or lacking transparency have originated from the development 
of the grammar combined with the effects of lexicalization. N→V conversion 
has been a persistent rule in a changing grammar. It was present at the Latin stage of 
the language (see Table 7), and endured throughout the centuries up to the present day, 
while important changes took place elsewhere in the grammar. 


Table 7: N—V conversion in Latin 


N English V English 

cor cordis heart recordor to call to mind, to remember 
glacies ice glacio to freeze 

navigium vessel, ship navigo to navigate, to sail 

onus oneris cargo, burden, load onero to load, to burden 

pignus pignoris bet, stake, pledge pignoro to pledge 

pilum hair pilo to depilate 

pugnus fist pugno to fight 

sal salt salo to salt 

velum curtain, sail, covering velo to enfold, envelop, veil 


When speakers found it useful for communication, the output of the rule entered into 
usage and was lexicalized. This happened at various periods, when the meaning of the 
base noun could be different from today's, and when there was a regular phonological 
variation given up later. But the original forms and meanings could remain in the lexicon. 

Moreover, once a construed word has entered the mental lexicon, its meaning may develop 
freely, which leads to reduced or lacking transparency with respect to the original 
meaning, founded on some SF and its conceptual resolution. 
What does that mean for the morphological process as a part of mental grammar? 
Remember that word formation rules are thought to have a double purpose: they create 
possible words, and they analyze existent words. Hence the N→V conversion rule 
will not create opaque or semi-transparent forms. However, as a means of learning and 
understanding construed lexemes, it will also cope with semi-transparent forms, to the 
extent that suitable variation patterns are present in the lexicon. Thus speakers will presumably 
be able to relate ciseler /sizəle/ to ciseau /sizo/ or marteler /maʁtəle/ to marteau 
/maʁto/, because these pairs show a variation pattern that is also present elsewhere in 
the lexicon. In addition, a clear semantic relationship between the noun and the verb certainly 
is a strong support to transparency. It would be interesting to see experimental 
research on this point. 
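
The notion of a variation pattern that is "also present elsewhere in the lexicon" can be illustrated with a small check on transcribed stems. This is only a sketch under assumptions: the transcriptions use /ə/ and /ʁ/ as conventional IPA values, the verb stem is taken to be the infinitive without its final /e/, and the function name matches_alternation is invented for the example.

```python
def matches_alternation(noun_form: str, verb_stem: str,
                        noun_end: str = "o", verb_end: str = "əl") -> bool:
    """Check whether a noun form and a verb stem differ only by a known alternation.

    By default the alternation is noun-final /o/ versus verb-stem-final /əl/,
    as in niveau /nivo/ and niveler /nivəle/.
    """
    if not (noun_form.endswith(noun_end) and verb_stem.endswith(verb_end)):
        return False
    # What remains after removing the alternating endings must be identical.
    return noun_form[: -len(noun_end)] == verb_stem[: -len(verb_end)]


# Noun form and verb stem (infinitive minus final /e/), transcriptions assumed.
PAIRS = {
    "ciseau ~ ciseler": ("sizo", "sizəl"),
    "marteau ~ marteler": ("maʁto", "maʁtəl"),
    "niveau ~ niveler": ("nivo", "nivəl"),
    "faux ~ faucher": ("fo", "foʃ"),
}

SUPPORTED = {pair: matches_alternation(*forms) for pair, forms in PAIRS.items()}
```

Under these assumptions the first three pairs match the pattern, while faux ~ faucher does not, which corresponds to its classification as an isolated variation.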
Appendix 


The Appendix contains some tables that would have disturbed the reading process of the 
main text. 


Table 8: Verbs derived from nouns that denote a body part 


V            English                                                   N        English
ciller       to blink                                                  cil      eyelash
enculer      to sodomize                                               cul      arse
doigter      to use one's fingers correctly on a piano and similar     doigt    finger
griffer      to scratch                                                griffe   claw
gueuler      to yell                                                   gueule   mouth
manier       to handle                                                 main     hand
peler        to peel                                                   peau     skin
plumer       to pluck (a bird)                                         plume    feather
dépiler      to depilate                                               poil     hair
sourciller   to raise one's eyebrows                                   sourcil  eyebrow
talonner     to follow someone's heels                                 talon    heel
zyeuter      to take a look at                                         yeux     eyes
Table 9: Verbs derived from nouns that denote an instrument 


V English N English 
ancrer to anchor ancre anchor 
arquer to curve arc bow 
basculer to topple over bascule seesaw 
bêcher to dig with a spade bêche spade 
boulonner to bolt boulon bolt 
brosser to brush brosse brush 
ceinturer to surround ceinture belt 
ciseler to chisel ciseau chisel 
claironner to shout from the rooftops clairon bugle 
clouer to nail clou nail 
cravacher to whip cravache whip 
crocheter to pick (a door, a lock) crochet picklock 
chainer to put on snow chains chaine chain 
faucher to scythe faux scythe 
filtrer to filter filtre filter 
flinguer to blow away, to shoot flingue gun 
flûter to produce a flute-like sound flûte flute 
fouetter to flog, to whip fouet whip 
fourcher to split fourche fork 
freiner to brake frein brake 
fronder to satirize fronde sling, revolt 
fusiller to shoot (a condemned person) fusil rifle 
hacher to chop hache ax 

griller to grill gril grill 

limer to file lime file 
marteler to beat, to pound marteau hammer 
menotter to handcuff menottes handcuffs 
miner to mine, to sap mine mine 
mitrailler to machine-gun mitraille hail of bullets 
peigner to comb peigne comb 
photographier to photograph photographie photography 
pilonner to bombard, to grind pilon pestle 
poignarder to stab poignard dagger 
raboter to plane rabot plane 
sabrer to cut down sabre sword 
scier to saw scie saw 
tambouriner to hammer, to drum tambourin tambourine 
tamiser to sieve tamis sieve 
téléphoner to phone téléphone phone 

se tirebouchonner to be twisted, to be wrinkled tirebouchon corkscrew 
visser to screw on vis screw 
vriller to bore, to pierce vrille spiral 
Table 10: Verbs derived from nouns that denote a substance 


V English N English 
aérer to air air air 
amidonner to starch amidon starch 
argenter to silver argent silver 
bétonner to concrete béton concrete 
beurrer to butter beurre butter 
bitumer to asphalt, to tarmac bitume bitumen 
charbonner to blacken charbon coal 
chiffonner to crumple chiffon mousseline, rag 
cimenter to cement ciment cement 
cirer to polish (shoes, the floor) cire wax 

crotter to muddy crotte dropping 
cuivrer to bronze, to copper cuivre copper 
émailler to enamel émail enamel 
fariner to flour farine flour 

ferrer to shoe (a horse) fer iron 
feutrer to felt feutre felt 
enfieller to fill with bile fiel bile, venom 
enfumer to fill with smoke fumée smoke 
gazer to gas gaz gas 

givrer to frost over givre frost 
goudronner to tarmac goudron tar 

graisser to grease graisse grease 
huiler to oil huile oil 

larder to lard lard fat streaky bacon 
pimenter to put chillies in piment hot pepper 
plastiquer to carry out a bomb attack on plastic plastic explosive 
plâtrer to plaster plâtre plaster 
plomber to fill (a tooth), to seal plomb lead 
poivrer to pepper poivre pepper 
poudrer to powder poudre powder 
rouiller to rust rouille rust 

sabler to sandblast sable sand 
saigner to bleed sang blood 
savonner to rub soap on savon soap 

saler to salt sel salt 

sucrer to put sugar in sucre sugar 
Table 11: Verbs derived from nouns that denote a container 


V English N English 
archiver to archive archives archive 
cuver to ferment cuve tank 
engainer to put into its sheath gaine sheath 
engranger to gather in, to store grange barn 
emmagasiner to store magasin store 
emprisonner to imprison prison prison 
enregistrer to register registre register 
Chapter 20 


Lexeme equivalence or rivalry of lexemes? 


Jana Strnadová 


Google, inc. 


This paper deals with the purported interchangeability between nouns and adjectives de- 
rived from nouns in French. The question of equivalence or rivalry between a morphologi- 
cally complex adjective and a syntactic construction containing a morphologically-related 
noun links a field of studies on rivalry between inflected word forms, derivational suffixes 
or different syntactic constructions to express the same meaning. This paper then presents 
a corpus-based study of the relative distribution of nominal or adjectival realizations of a 
modifier of the same head noun and discusses some motivations that play a role in the choice 
of one or the other strategy. 


1 Introduction 


Both in syntax and in morphology, the same content can be expressed by different struc- 
tural means. 

In syntax, this may take the form of valency alternations such as the English dative 
alternation (e.g. Mary gave a watch to me vs. Mary gave me a watch) or of word order alter- 
nations such as exemplified by the position of French attributive adjectives with respect 
to their governing noun. Such alternations have been the focus of much attention in the 
recent literature which focuses on establishing the interplay of various non-categorical 
factors (see e.g. Bresnan et al. 2007 on the dative alternations, Thuilier 2012 on French 
adjectives). 

In morphology, the consensus has long been that such alternations are nonexistent 
or unexpected: in inflection, a unique form was assumed to fill each cell of a lexeme’s 
paradigm (Anderson 1992, Stump 2001); in word formation, rivalry between affixes was 
taken to be resolved by blocking (Aronoff 1976). This consensus has progressively collapsed 
in the last two decades. Under the impetus of Thornton (2012), the phenomenon 
of overabundance, where multiple forms fill a paradigm cell, has become a central issue 
in inflectional morphology (see e.g. Bermel & Knittl 2012 for Czech noun declension, 

Jana Strnadová. Lexeme equivalence or rivalry of lexemes? In Olivier Bonami, 
Gilles Boyé, Georgette Dal, Hélène Giraudo & Fiammetta Namer (eds.), The lexeme 
in descriptive and theoretical morphology, 509-525. Berlin: Language Science Press. 
Stump 2016, Bonami & Crysmann this volume, Thornton this volume). Likewise, situations 
of non-categorical competition between derivational processes have moved from 
the fringes (Rainer 1988, Plag 1999) to the center of attention for derivational morphologists 
(Lindsay & Aronoff 2013, Villoing 2009, Tribout 2010, Fradin 2012, Koehl 2012, 
Namer 2013, Strnadová 2014). 

In this paper, I focus on situations of alternation between the morphological or syntactic 
expression of some content. This is familiar in the context of inflection where 
overabundance between synthetic and periphrastic expression of paradigm cells is well-documented 
(Aronoff & Lindsay 2014, Bonami 2015). For example, friendlier and more 
friendly are both realizations of the comparative degree of the lexeme FRIENDLY. Situations 
in which a syntactic construction and a derivational process lead to the expression of 
the same content have been comparatively less studied.¹ Here I will specifically examine 
the expression of nominal modification by a prepositional phrase containing some noun 
N or a denominal adjective derived from that same noun. This is illustrated in (1): the 
adjective grammaticale in (1a) and the noun grammaire introduced by the preposition de 
in (1b) roughly make the same contribution. 


(1) a. faute grammaticale 
‘grammatical mistake’ 


b. faute de grammaire 
‘grammar mistake’ 


The central questions that arise in view of such examples are 1) to what extent can 
the adjective and the prepositional phrase be taken to be semantically equivalent and 
2) whether the two constructions should be taken to be paradigmatic alternatives in the 
same way as friendlier and more friendly are. 


2 Background and methodology 


The proximity between a denominal adjective and a prepositional phrase containing a 
morphologically related noun was observed as early as Dumarsais (1769: 413): “When 
there is a simple preposition de, without an article, the preposition and its complement 
are considered adjectively. Un palais de roi is equivalent to un palais royal ‘royal palace’; 
une valeur de héros is equivalent to une valeur héroïque ‘heroic value’.”² Bally (1944) used the 
term transpositions and Tesnière (1969) called this kind of adjectivisation translations. 
The idea of equivalence between the two constructions was discussed later, for example 
by Bosredon (1988) or Bartning & Noailly (1993), or in a more semantic approach, by 

1 In French, for example, the topic of possible competition between morphologically complex words and 
syntactic phrases has been studied for causative verbs (Dal & Namer 2003). 

2 Orig. “Lorsqu'il n'y a qu'une simple préposition de, sans l'article, la préposition et son complément sont 
pris adjectivement. Un palais de roi, est équivalent à un palais royal; une valeur de héros équivaut à une 
valeur héroïque.” 
Nowakowska (2004) and Roché (2006: 380), who insists on the equivalence by describing 
“the adjectivized noun lexically as it can be syntactically with the preposition de”.³ 

Functional and semantic equivalence between a denominal adjective and its base noun 
used in a prepositional phrase is thus considered as one of the characteristics of denomi- 
nal adjectives. The examples (1)-(3) show the possibility to substitute a derived adjective 
with a prepositional phrase. 


(2) a. le climat social 
‘social climate’ 


b. le climat de la société 
‘climate of the society’ 


(3) a. secret de famille 
‘family secret’ 


b. secret familial 
‘family secret’ 


The question is then to what extent prepositional phrases are functionally and semantically 
equivalent to denominal adjectives in French. This question was of central 
importance in the 1980s and 1990s. At that time, the interest focused on the argument 
realization of the head noun with the goal of defining the syntactic and semantic relations 
within a noun phrase (Bartning 1980, Pinchon 1980, Monceaux 1993, etc.). These 
works showed that adjectives and prepositional phrases are not equivalent and are not 
interchangeable without any restriction. 

More recently, Deléger & Cartoni (2010) studied the use of an adjective or of its corre- 
sponding prepositional phrase in specialized or general medical corpora and showed that 
there is a preference for the use of adjectives in specialized texts, while corresponding 
prepositional phrases are more frequent in non-specialized texts (4). 


(4) a. rythme cardiaque 
‘cardiac rhythm’ 


b. rythme du coeur 
‘heart rhythm’ 


Finally, Boleda et al. (2012) provided some statistical evidence supporting the claim 
that an ethnic adjective, which is in a certain way a denominal adjective, cannot be 
interpreted as the argument of the noun as in (5). The adjective acts as a simple modifier. 
In their study, the modified noun is a predicative noun. 


(5) a. French agreement


b. agreement by France 


³Orig. “le nom adjectivé lexicalement comme il peut l'être syntaxiquement par la préposition de”.
All these studies have one thing in common: they do not differentiate between cases
where the prepositional phrase contains a fully determined NP and those where it
contains just a bare noun. In (6), the adjective gouvernementale is in competition with the
prepositional phrase containing a definite noun phrase (le gouvernement⁴), while in (7),
the preposition governs a bare noun (publicité). Semantically, in (6), the noun phrase
within the PP refers to the cabinet, while the noun in (7) does not refer to an
advertisement.


(6) a. décision gouvernementale 
‘governmental decision’ 


b. décision du gouvernement 
‘the government’s decision’


(7) a. campagne publicitaire
‘advertising campaign’


b. campagne de publicité
‘advertising campaign’


Contrary to these previous studies, I examine denominal adjectives and their syntactic
equivalents, restricting attention to prepositional phrases that contain a bare noun
introduced by the preposition de. In such cases, the noun does not head a referential
expression. Note that this restriction entails that the investigation be limited to cases
where the adjective is derived from a common noun, as exemplified in (8). Adjectives derived
from proper names are excluded, since proper names, being definite noun phrases, are
referential expressions.


(8) campagne de publicité / publicitaire 


Three situations must be distinguished concerning the availability of a denominal ad- 
jective corresponding to a French noun: (i) there is an adjective regularly derived from a 
noun (9); (ii) there is an adjective with a formal mismatch in comparison with the noun 
(10); (iii) there is no adjective (11) and hence a prepositional phrase is the only possible 
realization of the modifier (12). 


(9) PUBLICITÉ ‘advertisement’ ~ PUBLICIT-AIRE ‘advertising’
(10) LANGUE ‘language’ ~ LINGUISTIQUE / *LANGUIQUE


(11) a. DÉCOLLAGE ‘take-off’ ~ ?
b. ARRIVÉE ‘arrival’ ~ ?
c. SECOURS ‘emergency’ ~ ?


(12) a. piste de décollage
‘runway’


b. hall d’arrivée
‘arrival hall’


c. issue de secours
‘emergency exit’


⁴The definite article le is merged with the preposition de, which results in du gouvernement.


It is notable that languages differ in this respect. As Table 1 shows, Czech tends to
have denominal adjectives available where French does not. English has the same gap as
French but uses compounding rather than PP modification as an alternative strategy.


Table 1: Comparison between French, Czech and English noun phrases


French               Czech             English
N₁ de N₂             A₂ N₁             N₂ N₁
hall d'arrivée       příjezdová hala   arrival hall
issue de secours     nouzový východ    emergency exit
carte de crédit      kreditní karta    credit card
piste de décollage   vzletová dráha    runway


To study the rivalry between denominal adjectives and prepositional phrases, the fol- 
lowing resources were used: 


1. A lexicon of noun-adjective pairs from DenALex (Strnadová & Sagot 2011) and 
Lexique3 (New 2006). 5,888 noun-adjective pairs with regularly derived adjectives 
and 234 noun-adjective pairs with a formal mismatch were obtained in this way. 


2. The Est républicain corpus, which covers three years of a local newspaper (1999,
2002, 2003) and contains 119.5 million word tokens with morphosyntactic annotation
(Seddah et al. 2012).


Table 2 illustrates the diversity of denominal adjectives contained in the lexicon. 

The following methodology was applied: the corpus was searched for all combinations
in which a noun is followed by an adjective from the lexicon or by a prepositional phrase
with de containing a noun from the lexicon (13).


(13) lexicon entry: PUBLICITÉ - PUBLICITAIRE ‘advertisement - advertising’
corpus search 1: X_N publicitaire
corpus search 2: X_N de publicité
search result: campagne de publicité, campagne publicitaire, etc.
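As an illustration only, the two searches in (13) can be sketched in Python over a list of annotated tokens. The token format (form, lemma, part of speech), the tag names NOUN, ADJ and ADP, the one-entry lexicon and the function name count_combinations are all assumptions made for this sketch; they do not describe the actual annotation of the corpus or the tools used in the study.

```python
from collections import Counter

# Hypothetical one-entry lexicon: base noun lemma -> derived adjective lemma.
LEXICON = {"publicité": "publicitaire"}
ADJ_TO_NOUN = {adj: noun for noun, adj in LEXICON.items()}

# Only the bare forms of the preposition count; contracted forms such as
# "du" or "des" signal a determined NP and are deliberately excluded.
BARE_DE = {"de", "d'", "d’"}


def count_combinations(tokens):
    """Count N1 A2 and N1 de N2 sequences in (form, lemma, pos) tuples.

    Keys of the returned Counter are (N1 lemma, N2 lemma, realization),
    where realization is "A" (adjective) or "deN" (prepositional phrase).
    """
    counts = Counter()
    for i, (_, lemma, pos) in enumerate(tokens):
        if pos != "NOUN":
            continue
        # Search 1: head noun immediately followed by a lexicon adjective.
        if i + 1 < len(tokens):
            _, next_lemma, next_pos = tokens[i + 1]
            if next_pos == "ADJ" and next_lemma in ADJ_TO_NOUN:
                counts[(lemma, ADJ_TO_NOUN[next_lemma], "A")] += 1
        # Search 2: head noun + bare de + lexicon noun.
        if i + 2 < len(tokens):
            prep_form, _, prep_pos = tokens[i + 1]
            _, n2_lemma, n2_pos = tokens[i + 2]
            if (prep_pos == "ADP" and prep_form.lower() in BARE_DE
                    and n2_pos == "NOUN" and n2_lemma in LEXICON):
                counts[(lemma, n2_lemma, "deN")] += 1
    return counts


# Invented mini-sample, not corpus data.
sample = [
    ("campagne", "campagne", "NOUN"), ("publicitaire", "publicitaire", "ADJ"),
    ("et", "et", "CCONJ"),
    ("campagne", "campagne", "NOUN"), ("de", "de", "ADP"),
    ("publicité", "publicité", "NOUN"),
]
result = count_combinations(sample)
```

Matching on the surface form of the preposition rather than on its lemma is one simple way to keep only bare nouns; a real implementation would have to follow whatever conventions the corpus annotation uses for contracted articles.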


The vocabulary used throughout this article can be defined as follows: N₁ is the modified
noun or the head noun. A₂ is the modifying denominal adjective. N₂ is the noun
morphologically related to the adjective A₂. The term combination stands for the search
results N₁ A₂ and N₁ de N₂. In each combination, de N₂ stands for the nominal realization
and A₂ for the adjectival realization of the modifying concept N₂.




Jana Strnadova 


Table 2: Sample of French Denominal Adjectives


Suffix   Noun                  Adjective
-aire    CELLULE ‘cell’        CELLULAIRE ‘cellular’
-al      PARENT ‘parent’       PARENTAL ‘parental’
-el      CULTURE ‘culture’     CULTUREL ‘cultural’
-esque   CARNAVAL ‘carnival’   CARNAVALESQUE ‘of carnival’
-eux     ANGINE ‘angina’       ANGINEUX ‘anginal’
-ien     MICROBE ‘microbe’     MICROBIEN ‘microbial’
-ier     CÔTE ‘coast’          CÔTIER ‘coastal’
-ique    MÉTHODE ‘method’      MÉTHODIQUE ‘methodical’
-u       FEUILLE ‘leaf’        FEUILLU ‘leafy’


For each triple ⟨N₁, A₂, N₂⟩, I computed the frequency F₁ of the noun-adjective sequence
N₁ A₂, the frequency F₂ of the N₁ de N₂ sequence, their sum frequency
SumFreq = F₁ + F₂, and the relative frequency of the N₁ A₂ sequence,
Rfreq = F₁ / SumFreq. For instance, for the triple (campagne, publicitaire, publicité),
the corpus contains 40 occurrences of campagne publicitaire and 27 occurrences of
campagne de publicité; hence SumFreq = 67 and Rfreq = 40/67 ≈ 0.6.
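The two measures can be written out as a short Python sketch. The function name rivalry_stats is hypothetical; the only figures used are the two frequencies quoted above for campagne publicitaire and campagne de publicité.

```python
def rivalry_stats(f1, f2):
    """Return (SumFreq, Rfreq) for one triple.

    f1 is the frequency of the N1 A2 sequence and f2 the frequency of the
    N1 de N2 sequence; Rfreq is the share of the adjectival realization.
    """
    sum_freq = f1 + f2
    if sum_freq == 0:
        raise ValueError("the triple is not attested in either realization")
    return sum_freq, f1 / sum_freq


# campagne publicitaire (40) vs. campagne de publicité (27)
sum_freq, rfreq = rivalry_stats(40, 27)
print(sum_freq, round(rfreq, 2))  # 67 0.6
```

An Rfreq of 1 thus means that only the adjective is attested, and an Rfreq of 0 that only the prepositional phrase is.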


3 Corpus-based results 


A first study focused on the pairs containing a regular denominal adjective, i.e. pairs
with no formal mismatch between the noun and the adjective apart from the suffix. 139,838
types of combinations (out of 1,137,137 occurrences) were collected. 45% of nouns (2,686
lexemes) from the lexicon were attested in the corpus. Likewise, 30% of adjectives (1,708
lexemes) were attested. Incomplete attestation was to be expected, since the lexicon
contains many scientific terms which are not found in a journalistic corpus, and many types
have a very low frequency anyway.

The data distribution is presented in Table 3. 

There is a positive correlation between the token frequency of the triple (SumFreq)
and the proportion of cases where both strategies are attested. In particular, whereas
only 4% of triples are attested in both strategies overall, this proportion rises to 26% for
triples with a SumFreq of at least 1,000.

For the rest of the study, only the types with a sum frequency of at least 10 were taken
into account. At this threshold, 17% of cases can be realized either as an adjective or as
a prepositional modifier and are thus possible rivals. This corresponds to 937 different
nouns (16% of the lexicon) and 659 adjectives (11% of the lexicon).
Table 3: Type counts of N₁A₂ and N₁deN₂ combinations by sum token frequency
of the triple


SumFreq   All       Only N₁A₂   Only N₁deN₂   Both    %
≥0        139,838   70,876      63,145        5,817   4%
≥10       13,422    6,175       4,986         2,261   17%
≥100      1,586     687         535           364     23%
≥1000     100       45          29            26      26%
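A row of Table 3 can be thought of as the output of a simple classification of triples at a given SumFreq threshold. The sketch below is a minimal illustration under stated assumptions: the function name type_counts, the dictionary layout and the toy frequencies are invented and are not the data or code of the study.

```python
def type_counts(triples, threshold):
    """Count triple types by attested realization at a SumFreq threshold.

    `triples` maps (N1, A2, N2) to (F1, F2), the token frequencies of the
    N1 A2 and N1 de N2 sequences. Every triple is assumed to be attested,
    i.e. F1 + F2 >= 1.
    """
    kept = [(f1, f2) for f1, f2 in triples.values() if f1 + f2 >= threshold]
    only_a = sum(1 for f1, f2 in kept if f2 == 0)
    only_de_n = sum(1 for f1, f2 in kept if f1 == 0)
    both = len(kept) - only_a - only_de_n
    share_both = both / len(kept) if kept else 0.0
    return {"all": len(kept), "only_A": only_a, "only_deN": only_de_n,
            "both": both, "share_both": share_both}


# Toy frequencies, invented for illustration (not corpus data).
toy = {
    ("n1_a", "adj_x", "n2_x"): (40, 27),  # both realizations attested
    ("n1_b", "adj_x", "n2_x"): (12, 0),   # adjective only
    ("n1_c", "adj_y", "n2_y"): (0, 15),   # de N2 only
    ("n1_d", "adj_y", "n2_y"): (3, 0),    # dropped at a threshold of 10
}
row = type_counts(toy, threshold=10)
```

The last column of Table 3 corresponds to share_both: for the threshold of 10, for instance, 2,261 of 13,422 types amounts to about 17%.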


At this threshold, 46% of cases only have an adjectival realization for the same head noun
and 37% of combinations only have the nominal realization. This leads to a U-shaped
distribution with many cases at the edges and few cases in the middle of the distribution,
what Zuraw (2016) calls a “polarized distribution”. If denominal adjectives and
prepositional phrases were in free variation, then many more cases would be expected in
the middle of the distribution.

Table 4 shows the number of types in each interval of the distribution. As can be seen, 
many cases have a strong preference for one or the other realization. There are only 154 
types with a relative frequency between 0.4 and 0.6, which could be described as real 
cases of free variation. I will call pairs having such a distribution strong rivals. 


Table 4: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩ with
SumFreq ≥ 10


Rfreq interval      % of data   # of types
0 < Rfreq < 1       17%         2,261 types
0.2 < Rfreq < 0.8   5%          580 types
0.4 < Rfreq < 0.6   1%          154 types (strong rivals)
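The criterion for strong rivals can be stated compactly in code. This is only a sketch: the function name is_strong_rival and its parameter names are hypothetical, while the threshold of 10 and the interval from 0.4 to 0.6 are the values given in the text. The example frequencies are those reported for musicien talentueux / de talent and spectacle musical / de musique.

```python
def is_strong_rival(f1, f2, min_sum=10, low=0.4, high=0.6):
    """Return True if a triple counts as a strong rival.

    f1 and f2 are the frequencies of N1 A2 and N1 de N2. A triple qualifies
    if its sum frequency reaches `min_sum` and its relative frequency of
    the adjectival realization lies strictly between `low` and `high`.
    """
    total = f1 + f2
    if total < min_sum:
        return False
    return low < f1 / total < high


print(is_strong_rival(38, 48))   # musicien talentueux / de talent
print(is_strong_rival(409, 31))  # spectacle musical / de musique
```

The first call returns True (Rfreq is about 0.44), the second False (Rfreq is about 0.93). Widening the interval to 0.2 and 0.8 gives the broader class in the second row of Table 4.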


The U-shaped distribution of relative frequencies for triples is shown in Figure 1. In
order to make the figure readable, only data points with SumFreq ≥ 20 and 0 < Rfreq <
1 are shown. If no threshold were used, the edges would be much higher, as most of the
cases prefer one or the other realization.

Table 5 presents examples for the whole spectrum of relative frequencies, ranging 
from a strong preference for the adjectival realization at the top (Rfreq = 0.93 for the 
triple (spectacle, musical, musique)) to a strong preference for the nominal realization 
at the bottom (Rfreq = 0.06 for the triple (commission, disciplinaire, discipline)). 
[Plot. x-axis: Proportion of NA (0.0 to 1.0); y-axis: Number of types]


Figure 1: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩


Table 5: Examples of N₁A₂/N₁deN₂ combinations with their frequencies


N₁           A₂ / de N₂       Freq   Rfreq   Translation
spectacle    musical          409    0.93    ‘music show’
             de musique       31
musée        archéologique    640    0.90    ‘archaeological museum’
             d'archéologie    69
réseau       électrique       333    […]     ‘electrical grid’
             d'électricité    […]
troupe       théâtrale        347    0.5     ‘theatrical troupe’
             de théâtre       […]
situation    critique         47     0.38    ‘critical situation’
             de crise         78
soleil       automnal         21     0.16    ‘autumn sun’
             d'automne        […]
commission   disciplinaire    […]    0.06    ‘disciplinary committee’
             de discipline    226


Table 6 shows some examples which could be considered in free variation between A₂
and de N₂, since the relative frequency is situated between 0.4 and 0.6. For triples such as
⟨fête, familial, famille⟩ or ⟨troupe, théâtral, théâtre⟩, adjectival and nominal realizations
are equivalent.

These strong rivals are distributed across all suffixes, as shown in Table 7, which lists
some adjectives that compete with their corresponding nouns introduced by de.
Table 6: Examples of strong rivals (0.4 < Rfreq < 0.6)


N₁           A₂ / de N₂        Freq   Rfreq   Translation
fête         familiale         102    0.42    […]
             de famille        143
exposition   photographique    185    0.42    ‘photography exhibition’
             de photographie   258
musicien     talentueux        38     0.44    ‘talented musician’
             de talent         48
troupe       théâtrale         […]    […]     ‘theatrical troupe’
             de théâtre        […]
campagne     publicitaire      40     0.6     ‘advertising campaign’
             de publicité      27
politique    sécuritaire       […]    […]     ‘security policy’
             de sécurité       24

Table 7: Examples of strong rival adjectives sorted by suffix


Suffix   Examples of adjectives                                    Count
-aire    BUDGÉTAIRE ‘budgetary’, SÉCURITAIRE ‘security’            […]
-al      ARCHITECTURAL ‘architectural’, AUTOMNAL ‘autumnal’,       51
         MUSICAL ‘musical’, FAMILIAL ‘family’
-el      CONCURRENTIEL ‘competitive’, PROMOTIONNEL ‘promotional’   20
-esque   CARNAVALESQUE ‘carnival’                                  1
-eux     ARGILEUX ‘clay’, ORAGEUX ‘stormy’,                        […]
         GLORIEUX ‘glorious’, PRESTIGIEUX ‘prestigious’
-ier     LÉGUMIER ‘vegetable’, PRINTANIER ‘spring’                 11
-ique    ARCHÉOLOGIQUE ‘archaeological’,                           36
         INFORMATIQUE ‘information’, TOURISTIQUE ‘touristic’

As has been shown, the number of cases where both realizations receive the same
preference is rather low.

Remember that we have so far focused on cases where the formal relationship between
the denominal adjective and its base noun is straightforward. One might expect to find
different results where the relationship is more opaque. This is not what we found with
the lexicon containing 234 noun-adjective pairs with a formal mismatch. Table 8 presents
the distribution of rivals in this category according to the sum frequency and Table 9
gives some examples of combinations with their frequencies. The results on this data set
present a U-shaped distribution similar to the one seen in Figure 1.


Table 8: Absolute frequencies of triples ⟨N₁, A₂, N₂⟩ in the corpus where A₂
has an idiosyncratic form


SumFreq    # of types   Both realizations
f ≥ 0      29,884       1,713   6%
f ≥ 10     3,641        673     18%
f ≥ 100    582          140     24%
f ≥ 1000   52           19      36%

Table 9: Examples of N₁A₂ / N₁deN₂ with absolute and relative frequencies
where A₂ has an idiosyncratic form


N₁        A₂ / de N₂     Freq   Rfreq   Translation
eau       pluviale       435    0.75    ‘rain water’
          de pluie       149
éclipse   solaire        […]    […]     ‘solar eclipse’
          de soleil      79
stage     linguistique   […]    […]     ‘language course’
          de langues     18
loisir    estival        13     0.14    ‘summer leisure’
          d'été          81


Overall, there are not many cases where the adjective and the noun are used to modify
the same noun: we are far from a situation of interchangeability between the two.


4 Discussion 


4.1 Grammar conditions 


The low number of strong rivals is certainly due at least in part to grammatical or
semantic constraints. For example, the acceptability of the N₁ de N₂ realization is reduced
where N₁ is a deverbal noun. A likely explanation is that prepositional complements of
deverbal nouns tend to be interpreted as realizing an argument of the noun (14a), while
adjectives can act as simple modifiers (14b). The same de N₂ is fine if the head noun is
not deverbal (14c).


(14) a. ?visite d'archéologie 
‘visit of archaeology’ 
b. visite archéologique 
‘archaeological visit’ 
c. laboratoire d’archéologie 
‘archaeological laboratory’ 


Another example of such constraints, but this time in favor of free variation, is
represented by quality nouns such as exception ‘exception’, prestige ‘prestige’, talent
‘talent’, etc. and derived qualifying adjectives, such as talentueux ‘talented’, prestigieux
‘prestigious’, etc. In this case, the PP and the adjective can be functionally equivalent, as
shown in (15).


(15) a. musicien de talent 
‘talented musician’ 

b. musicien talentueux 
‘talented musician’ 


With this being said, there is a large residue of examples with a preference for one or
the other type of modifier without any clear grammatical motivation. I consider these
to be a matter of usage-based conventionalization. Thus, in (16), the very strong
preference for the given alternative (383 versus 1 for (16a) and 62 versus 5 for (16b))
is purely a matter of convention. In certain cases, a partial semantic specialization
can be observed. This is the case for the “false rivals” in (17), which do not have the same
meaning.


(16) a. fourniture scolaire ‘school supplies’ / f = 383 
b. sac d’école ‘school bag’ / f = 62 


(17) a. sortie scolaire ‘school outing’ / f = 65
b. sortie d’école ‘end of the school day’ / f = 73 


In conclusion, denominal adjectives and prepositional phrases with de are not in free
variation. Some cases can be explained by grammar, but conventionalization seems to
be an important factor which should be studied in more detail.


4.2 Lexical conditions 


Looking at the distributions of modifiers, the choice between adjectival and prepositional
modifiers seems notably conditioned by the lexical identity of the modifying concept.
Thus, if the modifier denotes ‘security’, there is a clear preference for the PP de sécurité,
while if the modifier denotes ‘region’, the preferred modifier is the adjective
RÉGIONAL, as shown in Figure 2.


[Two panels: de sécurité vs. sécuritaire; de région vs. régional. x-axis: Proportion of NA]


Figure 2: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩ where N₂
= sécurité / région


Each strategy has its own distribution. For example, for the pairs THÉÂTRE ‘theater’
/ THÉÂTRAL ‘theatrical’ and MUSIQUE ‘music’ / MUSICAL ‘musical’, there is a real rivalry
between the adjectival and the prepositional realization, as illustrated in Figure 3.


[Two panels: de théâtre vs. théâtral; de musique vs. musical]


Figure 3: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩ where N₂
= théâtre / musique


The four seasons provide another good example, illustrated in (18): as shown in Figure 4,
the use of a PP is much more frequent than the use of denominal adjectives, which are
commonly used only in a poetic register.


(18) balade d'automne / automnale ‘autumn walk’ 


Thus, register can also play a role in the choice of the realization. This observation cor- 
responds to the conclusion of Deléger & Cartoni (2010) on medical texts where adjectives 
are more frequent in specialized texts than in more general texts. 
[Panels: de printemps vs. printanier; d’été vs. estival]


Figure 4: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩ where N₂
is a season


Another question is to what extent the choice of one or the other alternative
is conditioned by the identity of the head noun. For example, the nouns zone ‘zone’ and
concours ‘competition’ have equally distributed adjectival and prepositional modifiers,
as presented in Figure 5. This would need to be assessed against the whole dataset, taking
into account the semantic relationship between the head noun and the modifying
concept, for example by relying on the principles of distributional semantics.
Figure 5: Distribution of relative frequencies of triples ⟨N₁, A₂, N₂⟩ where N₁ =
zone / concours
This section has shown that denominal adjectives and prepositional phrases are seldom
equivalent. First, free variation is rare. Then, the lexical identity of the pair noun
~ adjective is decisive for the choice of the preferred realization. Finally, in many cases,
this preference is purely conventional and cannot be explained in terms of grammar
alone.


5 Conclusion 


This paper questioned the purported equivalence between French denominal adjectives
and morphologically related nouns embedded in prepositional phrases introduced by de.
This idea has been present in the literature since at least the 18th century. In the same
way as two word forms can fill the same cell of a lexeme’s inflectional paradigm, two
dative constructions can alternate, or a synthetic and a periphrastic form can compete
for degree realization on adjectives and adverbs, there are two ways to express noun
modification: with a denominal adjective or with a prepositional phrase introduced by
de. Of course, there are other linguistic means that can be used for modification, but their
equivalence has not been taken for granted to the same extent.

I have shown that denominal adjectives and prepositional phrases are not in free
variation (sortie scolaire / sortie d’école). Instead, they have a U-shaped distribution with a
majority of cases favoring one or the other strategy and only a few cases in the middle of
the distribution. In general, there is some usage-based conventionalization which is not
written into any grammar rule but is learned implicitly when learning the language.
Register preferences may also play a role.

This paper presents a certain phenomenology of the question and an overview of
the kinds of factors that need to be taken into account and studied in more detail with
respect to the choice between adjectival and nominal realization. Moreover, not only is it
important to look into the rivals, but one also needs to look into the edges of the
distribution: are there any specific constructions where the use of one or the other strategy
can be predicted? A quick look at the data reveals that, for example, in combinations
which favor nominal realization, there are cases where N₁ is a deverbal noun and the
noun embedded in the prepositional phrase saturates its argument structure (demande
de soutien ‘request for support’, abandon de chien ‘dog abandonment’) or cases where N₂
is a deverbal noun and there is no adjective derived from it (horaire d’ouverture ‘opening
hours’, issue de secours ‘emergency exit’). Another group that favors N₁ de N₂ consists of
combinations where N₁ is a quantity or a measure noun (vingtaine de commerçants ‘about
twenty shopkeepers’, tonne d'acier ‘ton of steel’).

To conclude, both denominal adjectives and nouns embedded in prepositional phrases
with de can be used as modifiers, but they usually do not have the same distribution or the
same meaning. This brings us to a more theoretical question: could a prepositional phrase
be considered a possible candidate for the modifier cell of a derivational paradigm? As
we have seen, nouns for which there is no corresponding derived adjective, in particular,
would have this cell empty for a synthetic form, but could have it filled with a
prepositional phrase. This could be considered a sort of periphrasis, in much the same
way as inflectional paradigms contain synthetic and periphrastic forms. The results of
our corpus study suggest that extending this possibility to all lexemes would bring many
new challenges.
The lexeme in descriptive and
theoretical morphology


Since the 1970s, the notion of a lexeme, an abstract lexical unit identifying what is common to a 
set of words belonging to the same inflectional paradigm, has become a cornerstone of theoretical 
thinking on morphology and a standard tool for description. The present volume collects papers 
that crucially use, discuss or question the lexeme in the context of contemporary morphology, 
with particular emphasis on its place in the description of word formation through the concept of 
a Lexeme Formation Rule. It will be of interest to any descriptive linguist, theoretical linguist, or 
psycholinguist with an interest in morphology and its interface with syntax and lexical semantics. 


